Abstract

We consider a version of large population games whose agents compete for resources using
strategies with adaptable preferences. The games can be used to model economic markets, ecosystems or
distributed control. Diversity of initial preferences of strategies is introduced by randomly assigning
biases to the strategies of different agents. We find that diversity among the agents reduces their
maladaptive behavior. We find interesting scaling relations with diversity for the variance and other
parameters such as the convergence time, the fraction of fickle agents, and the variance of wealth,
illustrating their dynamical origin. When diversity increases, the scaling dynamics is modified by
kinetic sampling and waiting effects. Analyses yield excellent agreement with simulations.

PACS numbers: 02.50.Le, 05.70.Ln, 05.40.-a 
I. INTRODUCTION



Many natural and artificial systems involve interacting agents, each making independent
decisions to compete for limited resources, but globally exhibit coordinated behavior
through their mutual adaptation. Examples include the formation of ecological
patterns due to the competition of predators hunting for food, the price adjustment due to
the competition of buyers or sellers in economic markets, and the load adjustment due to
the competition of distributed controllers of packet flows in computer networks. While a
standard approach is to analyse the steady state behavior of the system described by the
Nash equilibria, it is legitimate to consider how the steady state is approached, since such
processes are dynamical in nature, and the approach may be interfered with by the presence of
periodic, chaotic or metastable attractors. Dynamical studies are especially relevant when
one considers the effects of a changing environment, such as that in economics or distributed
control.

The recently proposed Minority Games (MG) are prototypes of such multi-agent systems.
Extensive studies have revealed the steady-state properties of the game when the
complexity of the agents is high [6]. On the other hand, the dynamical nature of the adaptive
processes is revealed when the complexity of the agents is low, wherein the final states of
the system depend on the initial conditions, and the system often ends up with large
fluctuations at final states, far from the efficient state predicted by equilibrium studies.
The large fluctuations in the original MG are related to the uniformly zero preference of
strategies for all agents. This has to be re-examined for at least two reasons. First, when the
game is used to model economic systems, it is not realistic to expect that all agents have the
same preference when they enter the market. Rather, the agents have their own preferences
according to their individual objectives, expectations and available capital. For example,
some have stronger inclinations towards aggressive strategies, and others are more conservative.
Furthermore, in games which use public information only, identical initial preferences imply
that different agents would maintain identical preferences of strategies at all subsequent
steps of the game, which is again unlikely. Second, when the game is used to model
distributed control in multi-agent systems, identical preferences of strategies of the agents lead
to maladaptive behavior, which refers to the bursts of the population's decisions due to the
agents' premature rush to certain states. As a result, the population difference between
the majority and minority groups is large. For economic markets, this corresponds to large
price fluctuations; for distributed control, this corresponds to an uneven resource allocation;
both imply low system efficiency. Hence, maladaptation hinders the attainment of optimal
system efficiency.

There have been many attempts to improve the system efficiency. For example, thermal
noise [3] or biased strategies [11] are found to reduce the fluctuations. More relevant to this
work, there were indications that maladaptation can be reduced by appropriate choices of
the initial condition at the low complexity phase. The dependence on initial conditions was
noted in the replica approach to the exogenous MG. System efficiency can be improved by
random initial conditions in the original MG [12], or in systems driven by vectorized external
information [7]. It was noted that the reduced variance can be obtained hysteretically
by quasistatic increase and decrease of the complexity from an unbiased initial condition,
clearly demonstrating the non-equilibrium nature of this phenomenon. By generalizing
the strategy evaluation mechanism to the batch mode, and using a payoff function linear in
the winning margin, the generating functional analysis showed that fluctuations are reduced
by biased starts of the agents' strategy payoff valuations. The same is valid in its noisy
extension. However, no systematic studies about the effects of random biases have been
made.

In this paper, we consider the effects of randomness in the initial preferences of strategies
among the agents. Initial conditions can be selected to make the system dynamics completely
deterministic, thus yielding highly precise simulation results useful for refined comparison
with theories. As we shall see, a consequence of this diversity is that agents sharing common
strategies are less likely to adopt them at the same time, and maladaptation is reduced.
This results in an improved system efficiency, as reflected by the reduced variance of the
population decisions. We find interesting scaling relations with the diversity for the variance,
and a number of dynamical parameters, such as the convergence time, the fraction of fickle
agents, and the variance of wealth, illustrating their dynamical origin. When diversity
increases, we find that the scaling dynamics is modified by a sampling mechanism
self-imposed by the requirement of the dynamics to stay in the attractor, an effect we term
kinetic sampling. Preliminary results have been sketched elsewhere.

This paper is organized as follows. After introducing the Minority Game in Section II,
we discuss the variation of fluctuations when diversity increases, identifying 3 regimes of
behavior: multinomial, scaling, and kinetic sampling, analyzed in Sections III to V
respectively. Besides the fluctuations, other dynamical properties, namely, the fraction of fickle
agents, the convergence time, and the variance of wealth, are discussed in Sections VI to
VIII respectively. The paper is concluded in Section IX.

II. THE MINORITY GAME 

We consider a population of N agents competing selfishly to be in the minority group
in an environment of limited resources, N being odd. Each of the N agents can make a
decision 1 or 0 at each time step, and the minority group wins. For typical control tasks such
as the distribution of shared resources, the decisions 1 and 0 may represent two alternative
resources, so that fewer agents utilizing a resource implies more abundance. For economic
markets, the decisions 1 and 0 correspond to buying and selling respectively, so that the
buyers can win by belonging to the minority group, as a consequence of the price being
pushed down when supply is greater than demand, and vice versa.

Each agent makes her decision independently according to her own finite set of strategies,
randomly picked before the game starts. Each of her s strategies is based on the history of
the game, which is the time series of the winning bits in the most recent m steps. Hence,
m is the memory size. There are D = 2^m possible histories, thus D is the dimension of the
strategy space. While most previous work considered the case D ~ N, we will mainly study
the case m ~ 1 in this paper. As we shall see, this simplification enables us to make detailed
analysis of the system, revealing many new features.

A strategy is then a Boolean function which maps each of the D histories to decisions 1
or 0. Denoting the winning state at time t by σ(t) (σ(t) = 1, 0), we can convert an m-bit
history σ(t − m + 1), ..., σ(t) to an integer historical state μ*(t) of modulo D, given by



and the Boolean decisions of strategy a responding to input state μ are denoted by σ_a^μ = 1, 0,
corresponding to the binary decisions ξ_a^μ = ±1 via ξ_a^μ = 2σ_a^μ − 1. For subsequent analyses
of strategies, the label a of a strategy is given by an integer between 0 and 2^D − 1, where
The success of a strategy is measured by its cumulative payoff (also called virtual point
in the literature), which increases (decreases) by 1 if it indicates a winning (losing) decision
at a time step. Note that the payoffs attributed to the strategies at each step depend only
on the signs of the decisions, and are independent of the magnitude of the winning margins.
These are called step payoffs and follow the original version of the MG. Many recent
studies used payoffs with magnitudes increasing with the difference between the majority
and minority population. In particular, payoffs that are linear in the population difference
are called linear payoffs, and are found convenient in the application of analytical techniques
such as the replica method or the generating functional analysis. In the analysis of
this paper, the step payoff is more convenient.

At each time step, each agent chooses, out of her s strategies, the one with the highest 
cumulative payoff (updated every step irrespective of whether it is adopted or not) and 
makes decisions accordingly. The difference between the total number of winning and losing 
decisions of an agent up to a time step is called her wealth at that time. The long-term goal 
of an agent is to maximize her wealth. 

To model diversity among the agents, the agents may enter the game with diverse
preferences of their strategies. This means that each agent has random integer biases to the
initial cumulative payoffs of each of her s strategies. We are interested in how the extent
of randomness affects the system behavior, and there are many choices of the bias
distribution. A natural choice is the multinomial distribution, which can be modeled by assigning
integer biases to the s strategies of each agent, which add up to an odd integer R. Then,
the biased payoff of a strategy of an agent obeys a multinomial distribution with mean R/s
and variance R(s − 1)/s^2. The ratio ρ = R/N is referred to as the diversity.

For the binomial case s = 2 and odd R, which will be studied here, no two strategies
have the same cumulative payoffs throughout the game. Hence there are no ties, and the
dynamics of the game is deterministic, resulting in highly precise simulation results useful
for refined comparison with theories. This is in contrast with previous versions of the game,
which correspond to the special case of R = 0.
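
The game described above can be sketched in a few lines of code. The following is a minimal illustration for s = 2, not the simulation code used for the figures. The function name simulate_mg, its parameters, and the history update mu -> (2*mu + winner) mod D with the newly determined winning bit are assumptions made for this sketch. The two biases of each agent are drawn as r and R - r with r binomially distributed, so that they add up to the odd integer R.

```python
import random

def simulate_mg(N=101, m=1, R=15, steps=400, seed=0):
    """Hypothetical sketch: return the number of agents choosing decision 1 at each step."""
    assert N % 2 == 1 and R % 2 == 1  # odd N and odd R, so no ties can occur
    rng = random.Random(seed)
    D = 2 ** m
    # Two random strategies per agent; each maps a history 0..D-1 to a decision 0 or 1.
    strat = [[[rng.randint(0, 1) for _ in range(D)] for _ in range(2)]
             for _ in range(N)]
    # Initial cumulative payoffs: biases r and R - r, with r ~ Binomial(R, 1/2).
    payoff = []
    for _ in range(N):
        r = sum(rng.randint(0, 1) for _ in range(R))
        payoff.append([r, R - r])
    mu = rng.randrange(D)  # initial historical state
    ones = []
    for _ in range(steps):
        # Each agent uses her strategy with the higher cumulative payoff.
        n1 = sum(strat[i][0 if payoff[i][0] > payoff[i][1] else 1][mu]
                 for i in range(N))
        winner = 1 if n1 < N - n1 else 0  # the minority decision wins
        # Step payoffs: every strategy gains 1 if it indicated the winner, else loses 1.
        for i in range(N):
            for k in range(2):
                payoff[i][k] += 1 if strat[i][k][mu] == winner else -1
        mu = (2 * mu + winner) % D  # shift the winning bit into the history
        ones.append(n1)
    return ones
```

Because the payoff difference of each agent starts odd and changes by 0 or ±2 per step, the comparison in the strategy choice never meets a tie, which is why the run is fully determined by the seed.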

Furthermore, for an agent holding strategies a and b (with a < b), the biases affect her
decisions only through the bias difference ω of strategy a with respect to b. Hence we let
S_ab(ω) be the number of agents holding strategies a and b, where the bias of strategy a is
displaced by ω with respect to b, and its disordered average is

\langle S_{ab}(\omega) \rangle = \frac{N}{2^{2D-1}} \frac{1}{2^R} \binom{R}{\frac{R-\omega}{2}}.    (3)
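
As a quick numerical illustration of the bias statistics, the sketch below assumes that Eq. (3) takes the binomial form N / 2^(2D-1) times 2^(-R) C(R, (R - ω)/2), which follows from drawing the two biases as r and R - r with r binomial. The function names are hypothetical. It checks that the bias difference ω = 2r - R is normalized with variance R, and evaluates the mean number of agents per strategy pair and bias difference.

```python
from math import comb

def bias_difference_pmf(R):
    """P(omega) for omega = 2r - R with r ~ Binomial(R, 1/2); omega is odd when R is odd."""
    return {2 * r - R: comb(R, r) / 2 ** R for r in range(R + 1)}

def mean_pair_count(N, m, R, omega):
    """Assumed form of Eq. (3): mean number of agents holding a given pair a < b with bias difference omega."""
    D = 2 ** m
    return N / 2 ** (2 * D - 1) * bias_difference_pmf(R).get(omega, 0.0)

pmf = bias_difference_pmf(15)
total = sum(pmf.values())
variance = sum(w * w * p for w, p in pmf.items())
```

For m = 1 and R = 1 this gives N/16 agents for each of the two bias differences ±1 of a given pair, that is, N/8 agents per pair of distinct strategies.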
To describe the macroscopic dynamics of the system, we define the D-dimensional phase
space with the components A^μ(t), which is the fraction of agents making decision 1
responding to input μ of their used strategies, subtracted by that for decision 0. While only one of
the D components corresponds to the historical state μ*(t) of the system, the augmentation
to D components is necessary to describe the attractor structure and the transient behavior
of the system dynamics.

The key to analysing the system dynamics is the observation that the cumulative payoffs
of all strategies displace by exactly the same amount when the game proceeds, though their
initial values may be different. Hence for a given strategy pair, the profile of the cumulative
payoff distribution remains binomial, but the peak position shifts with the game dynamics.
Hence once the cumulative payoffs are known, the state location in the D-dimensional phase
space is given by

A^\mu(t) = \frac{1}{N} \sum_{a<b,\omega} S_{ab}(\omega) \{ \Theta[\omega + \Omega_a(t) - \Omega_b(t)] \xi_a^\mu + \Theta[-\omega - \Omega_a(t) + \Omega_b(t)] \xi_b^\mu \} + \frac{1}{N} \sum_a S_{aa} \xi_a^\mu,    (4)

where Ω_a(t) is the cumulative payoff of strategy a at time t, S_aa is the number of agents
holding 2 identical strategies labelled a, and Θ(x) is the step function of x. For agents
holding non-identical strategies a < b, the agents make decisions according to strategy a if
ω + Ω_a(t) − Ω_b(t) > 0, and strategy b otherwise. Hence ω + Ω_a(t) − Ω_b(t) is referred to as the
preference of a with respect to b. In turn, the cumulative payoff of a strategy a is updated
by

\Omega_a(t+1) = \Omega_a(t) - \xi_a^{\mu^*(t)} \mathrm{sgn} A^{\mu^*(t)}(t).    (5)

Fig. 1(a) illustrates the convergence to the attractor for the visualizable case of m =
1. The dynamics proceeds in the direction which tends to reduce the magnitude of the
components of A^μ(t) [6]. However, a certain amount of maladaptation always exists in the
system, so that the components of A^μ(t) overshoot, resulting in periodic attractors of period
2D, as reported in the literature. The state evolution is given by the integer equation

\mu^*(t+1) = \mathrm{mod}(2\mu^*(t) + \sigma(t), D),    (6)
so that every state appears as historical states two times in a steady-state period, with
σ(t) appearing as 0 and 1, each exactly once. One occurrence brings A^μ from positive to
negative, and the other brings it back from negative to positive, thus completing a cycle.
The components keep on oscillating, but never reach zero. This results in an antipersistent
time series. For the example in Fig. 1(a), the steady state is described by the sequence

\mu^*(t) = \sigma(t) = 0, 1, 1, 0,    (7)

where one notes that both states 0 and 1 are followed by 0 and 1 once each.
For m = 2, there are 2 attractor sequences as shown in Fig. 1(b),

\mu^*(t) = 0, 1, 3, 3, 2, 1, 2, 0,    (8)

and

\mu^*(t) = 0, 1, 2, 1, 3, 3, 2, 0.    (9)

Again, one notes that each of the states 0, 1, 2, 3 is followed by an even (σ(t) = 0) and
an odd state (σ(t) = 1) once each. Furthermore, we note that the attractor sequences in
Eqs. (8) and (9) are related by the conjugation symmetry μ(t) ↔ 3 − μ(t). For general values
of m, an attractor sequence can be obtained by starting with the state μ*(0) = σ(0) = 0,
and assigning σ(t) = 1 if the value of μ*(t) appears the first time in the sequence, and 0
the second time, such as the attractors in Eqs. (7) and (8). In general, other attractor
sequences can be obtained by computer search, and the number of attractor sequences can
be verified to be 2^D/2D, which forms the de Bruijn sequence in terms of m, corresponding
to the number of distinct ring configurations of length 2D, for which all sub-strings of length
m + 1 are distinct.
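
The first-visit/second-visit rule can be sketched as follows. This is an illustrative implementation under one assumption: the bit assigned on a visit to state μ is the one shifted into the next state through μ -> (2μ + bit) mod D. The function name attractor_sequence is hypothetical.

```python
def attractor_sequence(m):
    """Generate one attractor sequence of length 2D by the first-visit/second-visit rule."""
    D = 2 ** m
    visits = [0] * D          # number of times each state has appeared so far
    mu = 0                    # start from the state 0
    seq = []
    for _ in range(2 * D):
        seq.append(mu)
        bit = 1 if visits[mu] == 0 else 0   # 1 on the first visit, 0 on the second
        visits[mu] += 1
        mu = (2 * mu + bit) % D             # shift the bit into the history
    return seq

print(attractor_sequence(1))  # [0, 1, 1, 0]
print(attractor_sequence(2))  # [0, 1, 3, 3, 2, 1, 2, 0]
```

The two printed sequences reproduce Eqs. (7) and (8). Applying μ -> 3 − μ to the m = 2 output and rotating it to start from the second 0 gives the sequence of Eq. (9).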

The population averages of the decisions oscillate around 0 at the steady state. Since a
large difference between the majority and minority populations implies inefficient resource
allocation, the inefficiency of the game is often characterized by the normalized variance
σ^2/N of the population making decision 1 at the steady state. Since this population size at
time t is given by N(1 + A^{μ*(t)}(t))/2, we have

\frac{\sigma^2}{N} = \frac{N}{4} \langle [A^{\mu^*(t)}(t) - \langle A^{\mu^*(t)}(t) \rangle_t]^2 \rangle_t,    (10)

where ⟨ ⟩_t denotes time average at the steady state.
FIG. 1: (a) The state motion of a sample in the phase space for m = 1, s = 2, N = 1023 and
R = 16383. Empty dots: transient states. Solid dots: attractor states. (b) The attractors in a
two-dimensional phase subspace for m = 2. 6 of the 8 states remain in the second quadrant of
this subspace. The locations of the other 2 states are indicated in a subspace formed by a
different pair of components. The numbers in the circles denote the elements of
the attractor sequences in Eqs. (8) and (9).

As shown in Fig. 2, the variance σ^2/N of the population for decision 1 scales as a function
of the complexity α = D/N, agreeing with previous observations. When α is small, games
with increasing complexity create time series of decreasing fluctuations. A phase transition
takes place around α_c ≈ 0.3, after which the variance increases gradually to the limit of random
decisions, with σ^2/N = 0.25. When α < α_c, the occurrences of decisions 1 and 0 responding
to a given historical state μ are equal, and this is referred to as the symmetric phase [21]. On
the other hand, in the asymmetric phase above α_c, the occurrences of decisions are biased
for at least some history μ.

FIG. 2: The dependence of the variance of the population making decision 1 on the complexity
for different diversities at s = 2 averaged over 128 samples. The horizontal dotted line is the limit
of random decisions.

Figure 2 also shows the data collapse of the variance for different values of diversity ρ. It
is observed that the variance decreases significantly with diversity in the symmetric phase,
and remains unaffected in the asymmetric phase [22]. Furthermore, for a game efficiency
prescribed by a given variance σ^2/N, the required complexity of the agents is much reduced.

The dependence of the variance on the diversity is further shown in Figs. 3 and 4 for
memory sizes m = 1 and m = 2 respectively. The following three regimes can be identified
and explained in Sections III to V respectively: (a) multinomial regime: when ρ ~ N^{-1},
σ^2/N ~ N with proportionality constants dependent on m; (b) scaling regime: when ρ ~ 1,
σ^2/N ~ ρ^{-1} with proportionality constants independent of m for m not too large; (c)
kinetic sampling regime: when ρ ~ N, σ^2/N deviates above the scaling with ρ^{-1} due to
kinetic sampling effects as explained below, and the scaling is given by σ^2/N ~ f_m(Δ)/N,
where Δ is the kinetic step size given by

and f_m is a function dependent on the memory size m.

FIG. 3: The dependence of the variance of the population making decision 1 on the diversity at
m = 1 and s = 2. Symbols: simulation results averaged over 1024 samples. Solid lines: theory.
Dashed-dotted line: scaling prediction.

FIG. 4: The dependence of the variance of the population making decision 1 on the diversity at
m = 2 and s = 2. Notations are the same as those of Fig. 3. Inset: A comparison of the variances
at m = 1 and m = 2 in Figs. 3 and 4.

To analyse the behavior in these regimes, we derive the following expression for the step
ΔA^μ(t) = A^μ(t + 1) − A^μ(t) at time t. Using Eq. (4), we have

\Delta A^\mu(t) = \frac{1}{N} \sum_{a<b,\omega} S_{ab}(\omega) \{ \Theta[\omega + \Omega_a(t+1) - \Omega_b(t+1)] - \Theta[\omega + \Omega_a(t) - \Omega_b(t)] \} (\xi_a^\mu - \xi_b^\mu).    (12)

Since the arguments of the step functions are odd integers, nonzero contributions to Eq. (12)
come from terms with ω + Ω_a(t+1) − Ω_b(t+1) = ±1 and ω + Ω_a(t) − Ω_b(t) = ∓1. Using Eq. (5),
the two arguments differ by −(ξ_a^μ − ξ_b^μ) sgn A^μ(t) with μ = μ*(t). Hence the conditions for
nonzero contributions become equivalent to ω + Ω_a(t) − Ω_b(t) = ∓1 and ξ_a^μ − ξ_b^μ = ∓2 sgn A^μ(t)
for μ = μ*(t). This reduces the steps to

\Delta A^\mu(t) = \frac{1}{N} \sum_{a<b,\omega,\pm} S_{ab}(\omega) \, \delta(\omega + \Omega_a(t) - \Omega_b(t) \pm 1) \, \delta(\xi_a^{\mu^*} - \xi_b^{\mu^*} \pm 2 \mathrm{sgn} A^{\mu^*}(t)) \, (\pm)(\xi_a^\mu - \xi_b^\mu),    (13)

where μ* = μ*(t), and δ(n) = 1 if n = 0, and 0 otherwise. For μ = μ*(t), this can be further
simplified to

\Delta A^\mu(t) = -\mathrm{sgn} A^\mu(t) \, \frac{2}{N} \sum_{a<b,\pm} S_{ab}(\mp 1 - \Omega_a(t) + \Omega_b(t)) \, \delta(\xi_a^\mu - \xi_b^\mu \pm 2 \mathrm{sgn} A^\mu(t)).    (14)

To interpret this result, we note that changes in A^μ(t) are only contributed by fickle agents
with marginal preferences of their strategies, that is, those with ω + Ω_a(t) − Ω_b(t) = ±1
and ξ_a^μ − ξ_b^μ = ∓2 sgn A^μ(t) for μ = μ*(t). Furthermore, the step points in the direction that
reduces the magnitude of A^μ(t).

Similarly, the steps along the direction ν other than the historical state μ*(t) are given
by

\Delta A^\nu(t) = \frac{1}{N} \sum_{a<b,\pm} S_{ab}(\mp 1 - \Omega_a(t) + \Omega_b(t)) \, \delta(\xi_a^\mu - \xi_b^\mu \pm 2 \mathrm{sgn} A^\mu(t)) \, (\pm)(\xi_a^\nu - \xi_b^\nu),    (15)

where μ = μ*(t). This shows that the steps along the non-historical direction are contributed
by the subset of those fickle agents that contribute to the step along the historical direction,
and they can be positive or negative.

Next we consider the disordered average of the steps in Eq. (14). For this purpose, it is
convenient to decompose the cumulative payoffs as

\Omega_a(t) = \sum_\mu k_\mu(t) \xi_a^\mu,    (16)

where k_μ(t) is the number of wins minus losses of decision 1 up to time t when the game
responded to history μ. Since there are 2^D variables of Ω_a(t) and D variables of k_μ(t), this
decomposition greatly simplifies the analysis, and describes explicitly how Ω_a(t) depends on
the strategy decisions. Introducing the integral representation of the Kronecker delta for the
preference, we can factorize the contributions of Ω_a(t) − Ω_b(t) into a product over the states,

\delta(\omega + \Omega_a(t) - \Omega_b(t) \pm 1) = \int_0^{2\pi} \frac{d\theta}{2\pi} e^{i\theta(\omega \pm 1)} \prod_\mu e^{i\theta k_\mu (\xi_a^\mu - \xi_b^\mu)},    (17)

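where the explicit dependence on t is omitted for convenience here and in the subsequent
derivation. Using the identities

\delta(\xi_a^\mu - \xi_b^\mu \pm 2 \mathrm{sgn} A^\mu(t)) = \frac{1}{4} [1 \mp (\xi_a^\mu - \xi_b^\mu) \mathrm{sgn} A^\mu - \xi_a^\mu \xi_b^\mu],    (18)
e^{ik\theta(\xi_a - \xi_b)} = \cos^2 k\theta + i(\xi_a - \xi_b) \sin k\theta \cos k\theta + \xi_a \xi_b \sin^2 k\theta,    (19)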

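and introducing the average in Eq. (3), we obtain the following factorized expression from
Eq. (14) for μ = μ*(t),

\langle \Delta A^\mu(t) \rangle = -\frac{\mathrm{sgn} A^\mu}{2^{2D}} \sum_{a<b,\pm} \int_0^{2\pi} \frac{d\theta}{2\pi} \cos^R\theta \, e^{\pm i\theta} [1 \mp (\xi_a^\mu - \xi_b^\mu) \mathrm{sgn} A^\mu - \xi_a^\mu \xi_b^\mu]
    \times [\cos^2 k_\mu\theta + i(\xi_a^\mu - \xi_b^\mu) \sin k_\mu\theta \cos k_\mu\theta + \xi_a^\mu \xi_b^\mu \sin^2 k_\mu\theta]
    \times \prod_{\nu \neq \mu} [\cos^2 k_\nu\theta + i(\xi_a^\nu - \xi_b^\nu) \sin k_\nu\theta \cos k_\nu\theta + \xi_a^\nu \xi_b^\nu \sin^2 k_\nu\theta].    (20)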

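The summation over a < b can now be replaced by half times the independent summations
over a and b. Noting that for given distinct states μ, ν, ..., λ,

\sum_a \xi_a^\mu \xi_a^\nu \cdots \xi_a^\lambda = 0,    (21)

we find that all terms in the expansion of Eq. (20) vanish if they contain unpaired decisions
ξ_a^μ or ξ_b^μ. The final result is

\langle \Delta A^\mu(t) \rangle = -\mathrm{sgn} A^\mu \int_0^{2\pi} \frac{d\theta}{2\pi} \cos^R\theta \cos(2k_\mu - \mathrm{sgn} A^\mu)\theta \prod_{\nu \neq \mu} \cos^2 k_\nu\theta.    (22)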

Eq. (22) describes the change induced by the payoff component k_μ(t) incremented by
−sgn A^μ(t). Since the step size depends on time implicitly through the payoff components,
the sum of all changes induced by k_μ(t) incremented from 0 yields

\langle A^\mu(t) - A^\mu(0) \rangle = \int_0^{2\pi} \frac{d\theta}{2\pi} \cos^R\theta \, \frac{\sin 2k_\mu\theta}{2\sin\theta} \prod_{\nu \neq \mu} \cos^2 k_\nu\theta.    (23)
Similarly, the steps along the non-historical direction are given by

\langle \Delta A^\nu(t) \rangle = \int_0^{2\pi} \frac{d\theta}{2\pi} \cos^R\theta \sin k_\nu\theta \cos k_\nu\theta \sin(2k_\mu \mathrm{sgn} A^\mu - 1)\theta \prod_{\lambda \neq \mu,\nu} \cos^2 k_\lambda\theta,    (24)

where ν ≠ μ*(t) = μ. The same result can be obtained from Eq. (23) by considering the
difference of 2 equations when one of the states labeled μ becomes historical and k_μ changes
by −sgn A^μ.



III. THE MULTINOMIAL REGIME 

When ρ ~ N^{-1}, or R ~ 1, there is a finite number of clusters of agents who make
identical decisions throughout the game. Since there are many agents in a typical cluster,
their identical decisions will cause large fluctuations in their behavior. Consider the example
of m = 1 and R = 1. There are only 4 strategies. For a pair of distinct strategies, there is an
average of N/8 agents picking them, and N/16 agents in each cluster with biases ±1. As a
result, we have σ^2/N ~ N. The proportionality constant depends on m, and is sensitive to
the profile of the bias distribution. Since we consider the multinomial distribution in Eq. (3)
in this paper, we call this the multinomial regime. Another choice in the literature is the
bimodal distribution.

Consider the case m = 1. Eqs. (22) and (24) show that the step size ⟨ΔA^μ(t)⟩ ~ O(1)
and is thus self-averaging. Since A^μ(0) is Gaussian with variance N^{-1}, the values of A^μ(t)
at the attractors can be computed to O(1). Depending on the initial position A(0) =
(A^0(0), A^1(0)), 4 attractors can be identified. For example, if A(0) lies in the first quadrant,
and the initial historical state is 0, then the payoff components k = (k_0(t), k_1(t)) at the
attractor are given by k(0) = (0, 0), k(1) = (−1, 0), k(2) = (−1, −1), k(3) = (−1, 0),
provided that when ΔA^μ(t) = 0 to order 1, ΔA^μ(t) is also equal to 0 to order N^{-1/2}.
Analysis can be simplified by noting that when the payoff components k_μ(t) are restricted
to the values 0 and ±1, Eq. (23) can be written as

A^\mu(t) = k_\mu \int_0^{2\pi} \frac{d\theta}{2\pi} (\cos\theta)^{R+1+2\sum_{\nu \neq \mu} k_\nu^2} = k_\mu c_{R+1+2\sum_{\nu \neq \mu} k_\nu^2},    (25)

where c_n = 2^{-n} \binom{n}{n/2} for even integer n, and we have used the facts that A^μ(t) is
self-averaging, and A^μ(0) ~ N^{-1/2}. The locations of the 4 attractors are shown in Fig. 5 and
summarised in Table I.
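
The coefficients c_n can be checked numerically. The sketch below assumes c_n = 2^{-n} C(n, n/2), the value of the integral of cos^n θ over a period divided by 2π, and compares it with a direct quadrature and with the large-n form sqrt(2/πn). The function names are hypothetical.

```python
from math import comb, cos, pi, sqrt

def c(n):
    """Assumed closed form: c_n = 2^(-n) * C(n, n/2) for even n."""
    return comb(n, n // 2) / 2 ** n

def c_numeric(n, points=4096):
    """Average of cos(theta)^n over one period on an equally spaced grid.

    For a trigonometric polynomial of degree n < points this grid average
    equals the integral up to rounding errors.
    """
    return sum(cos(2 * pi * j / points) ** n for j in range(points)) / points

ratio_large_n = c(1000) / sqrt(2 / (pi * 1000))  # close to 1 for large n
```

For example, c(2) = 1/2 is the period average of cos^2 θ, and the ratio for n = 1000 differs from 1 only at the level of 1/(4n).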
FIG. 5: (a-d) The 4 attractors for m = 1 and s = 2 in the multinomial regime. The time steps
are relabeled with t = 0 corresponding to the state with μ*(t) = 0 and μ*(t + 1) = 1.

The variance of A^μ(t) of the historical states μ = μ*(t), averaged over the period for each
of the 4 attractors, can be obtained from Table I. The variance of decisions in Eq. (10),
averaged over the 4 attractors, is then given by

\frac{\sigma^2}{N} = \frac{N}{128} (7c_{R+1}^2 - 2c_{R+1}c_{R+3} + 7c_{R+3}^2).    (26)

The theoretical values are compared with simulation results for the first 3 points of each
curve corresponding to given values of N in Fig. 3. The agreement is excellent. Note that
the variance in this regime deviates from the scaling relation with ρ^{-1} in the next regime,
as evident from the splaying down from the linear relation in Fig. 3. However, when R ≫ 1,
c_{R+1} ≈ c_{R+3} ≈ sqrt(2/πR), and σ^2/N reduces to 3/16πρ, showing that the deviation from the
scaling gradually vanishes.
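
The approach to the scaling form can be illustrated numerically. The sketch below assumes that Eq. (26) reads σ^2/N = (N/128)(7c_{R+1}^2 − 2c_{R+1}c_{R+3} + 7c_{R+3}^2) with c_n = 2^{-n} C(n, n/2), and compares it with 3/16πρ for ρ = R/N. The function names are hypothetical.

```python
from math import comb, pi

def c(n):
    """Assumed closed form: c_n = 2^(-n) * C(n, n/2) for even n."""
    return comb(n, n // 2) / 2 ** n

def variance_m1(N, R):
    """Assumed form of Eq. (26): normalized variance for m = 1 in the multinomial regime."""
    c1, c3 = c(R + 1), c(R + 3)
    return N / 128 * (7 * c1 ** 2 - 2 * c1 * c3 + 7 * c3 ** 2)

def scaling_limit(N, R):
    """Large-R limit 3 / (16 pi rho) with rho = R / N."""
    return 3 / (16 * pi * (R / N))

for R in (1, 11, 101, 1001):
    print(R, variance_m1(1001, R) / scaling_limit(1001, R))
```

The printed ratios approach 1 as R grows, which is the gradual vanishing of the deviation described above.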

TABLE I: The 4 attractors for m = 1, s = 2 in the multinomial regime. In Tables I and II, the
time steps are relabeled with t = 0 corresponding to the state with μ*(t) = 0 and μ*(t + 1) = 1, the
superscripts ± of the value 0 indicate the possible signs to order N^{-1/2}, and A^μ(t) with asterisks
correspond to the historical states, which are used to compute the variance of decisions in Eq. (10).

Now consider the case m = 2. Starting from initial positions near the origin of the
4-dimensional phase space, we consider the attractors resulting from the 16 quadrants and 4
initial states. We find 16 attractors for the attractor sequence in Eq. (8). The positions of one
of the attractors are summarised in Table II, and the values of A^μ(t) for the historical states
μ = μ*(t), which are used to compute the variance of decisions in Eq. (10), are summarised
in Table III. Averaging over the period and over the attractors, the variance of decisions in
Eq. (10) becomes

\frac{\sigma^2}{N} = \frac{N}{1024} (14c_{R+7}^2 + 41c_{R+5}^2 + 42c_{R+3}^2 + 15c_{R+1}^2
    + 2c_{R+7}c_{R+5} - 2c_{R+7}c_{R+3} + 2c_{R+5}c_{R+3} - 2c_{R+5}c_{R+1}).    (27)

Since the attractor sequence in Eq. (9) is related to Eq. (8) by conjugation symmetry, this
expression is already the sample average of the variance. Again, the theoretical values of
the first 3 points of each curve in Fig. 4 have an excellent agreement with the simulation
results, and deviate from the scaling in the next regime. When R ≫ 1, c_{R+1} ≈ c_{R+3} ≈
c_{R+5} ≈ c_{R+7} ≈ sqrt(2/πR), and σ^2/N approaches 7/32πρ.

The variance of decisions for higher values of m can be obtained by exhaustive computer
search starting from the 2^D quadrants of the phase space and the D initial states. Since the
number of cases grows rapidly with D, one may use a Monte Carlo sampling of the initial
conditions to determine the variance.

TABLE II: An attractor for m = 2, s = 2 in the multinomial regime with the sequence in Eq. (8).

Before we close this section, we remark that the periodic average of the decisions A^μ(t)
at the historical states μ = μ*(t) has a vanishing sample average, but the periodic average
does not necessarily vanish for individual samples. For example, the attractor (a) in Table I
has a periodic average of ⟨A^μ(t)⟩ = −(c_{R+1} + c_{R+3})/2 at the historical states μ = μ*(t). The
variance is often regarded as a measure of the system efficiency, based on the observation that
the average decisions vanish at high values of m. However, this is not the case for
the low values of m we are studying. In the context of market modeling, a nonzero periodic
average of decisions indicates the existence of arbitrage opportunities, and in the context
of modeling multi-agent control, it means that there is an imbalance in the utilization of
resources. Hence the variance cannot be regarded as an intrinsic measure of global efficiency.
Nevertheless, the phase space motion points in the direction of reducing the winning margin,
as seen in Eq. (14), which traps the attractors around the origin, as shown in Figs. 1 and 5.
As a result, the average of decisions is bounded by the step sizes at the attractor, so that
small variances also imply small averages, and the variance can still be considered as a good
approximate measure of efficiency.



IV. THE SCALING REGIME 



When ρ ~ 1, the clusters of agents making identical decisions effectively become
continuously distributed in their preference of strategies. Since the shift of preferences at the
attractor is much narrower than the spread-out preference distribution, the size of the
clusters switching strategies is effectively independent of the detailed profile of the preference
distribution. For generic preference distributions, the width scales as sqrt(R), and hence the
size of typical clusters scales as N/sqrt(R). This leads to the scaling of the variance σ^2/N ~
ρ^{-1}. Compared with the typical cluster size scaling as N in the multinomial regime, the
typical cluster size in the scaling regime only scales as sqrt(N). Nevertheless, it is sufficiently
numerous that agent cooperation in this regime can be described at the level of statistical
distributions of strategy preference, resulting in the scaling relation.

Attractor      1          2          3          4          5          6          7          8
μ*(0) = 0      0          0          0          0          0          0^-        0          0
μ*(1) = 1      0^-        0^-        0^-        0^-        -c_{R+3}   -c_{R+5}   -c_{R+5}   -c_{R+7}
μ*(2) = 3      0^-        -c_{R+5}   0^-        -c_{R+7}   0          -c_{R+3}   0^-        -c_{R+5}
μ*(3) = 3      c_{R+5}    0^+        c_{R+7}    0^+        c_{R+3}    0^+        c_{R+5}    0^+
μ*(4) = 2      0^-        0^-        -c_{R+5}   -c_{R+7}   0          0          -c_{R+3}   -c_{R+5}
μ*(5) = 1      0          c_{R+7}    c_{R+3}    c_{R+5}    0^+        0^+        0^+        0^+
μ*(6) = 2      c_{R+3}    c_{R+5}    0^+        0^+        c_{R+5}    c_{R+7}    0^+        0^+
μ*(7) = 0      c_{R+1}    c_{R+3}    c_{R+3}    c_{R+5}    c_{R+3}    c_{R+5}    c_{R+5}    c_{R+7}

Attractor      9          10         11         12         13         14         15         16
μ*(0) = 0      -c_{R+1}   -c_{R+3}   -c_{R+3}   -c_{R+5}   -c_{R+3}   -c_{R+5}   -c_{R+5}   -c_{R+7}
μ*(1) = 1      0          0^-        0^-        0^-        -c_{R+1}   -c_{R+3}   -c_{R+3}   -c_{R+5}
μ*(2) = 3      0          -c_{R+3}   0^-        -c_{R+5}   0          -c_{R+1}   0^-        -c_{R+3}
μ*(3) = 3      c_{R+3}    0^+        c_{R+5}    0^+        c_{R+1}    0^+        c_{R+3}    0^+
μ*(4) = 2      0^-        0^-        -c_{R+3}   -c_{R+5}   0^-        0^-        -c_{R+1}   -c_{R+3}
μ*(5) = 1      c_{R+3}    0          c_{R+1}    c_{R+3}    0^+        0^+        0^+        0^+
μ*(6) = 2      c_{R+1}    c_{R+3}    0^+        0^+        c_{R+3}    c_{R+5}    0^+        0^+
μ*(7) = 0      0^+        0^+        0^+        0^+        0^+        0^+        0^+        0^+

TABLE III: The values of A^μ(t) for the historical states μ = μ*(t) for the attractors with m = 2,
s = 2 in the multinomial regime in Eq. (8). The time steps are relabeled with t = 0 corresponding
to the state with μ*(t) = 0 and μ*(t + 1) = 1, the superscripts ± of the value 0 indicate the signs
to order N^{-1/2}.

In the integral of Eq. (22), significant contributions only come from θ ~ 1/sqrt(R) or θ − π ~
1/sqrt(R), so that the factor cos^R θ can be approximated by exp(−Rθ^2/2). This simplifies
Eq. (22) to

\langle \Delta A^\mu(t) \rangle = -\sqrt{\frac{2}{\pi R}} \, \mathrm{sgn} A^\mu(t)    (28)

for μ = μ*(t). Since the step sizes scale as R^{-1/2}, they remain self-averaging. Similarly,
⟨ΔA^ν(t)⟩ = 0 for ν ≠ μ*(t), using Eq. (24). The 2 cases can be summarized as

\Delta A^\mu(t) = -\sqrt{\frac{2}{\pi R}} \, \delta_{\mu\mu^*(t)} \, \mathrm{sgn} A^\mu(t).    (29)

This result shows that the preference distribution among agents of a given pair is effectively
a Gaussian with variance R, so that the number of agents switching strategies at time t
scales as 2 times the height of the Gaussian distribution (2 being the shift of preference per
step), which is sqrt(2/πR). Thus by spreading the preference distribution, diversity reduces the
step size and hence maladaptation.

As a result of Eq. (29), the motion in the phase space is rectilinear, each step only making a move of fixed size along the direction of the historical state. Consequently, each state of the attractor is confined in a $D$-dimensional hypercube of size $\sqrt{2/\pi R}$, irrespective of the initial position of the $A^\mu$ components. This confinement enables us to compute the variance of the decisions. Without loss of generality, let us relabel the time steps in the periodic attractor, with $t = 0$ corresponding to the state with $\mu = 0$ and $\mu^*(t+1) = 1$. We denote by $t_\mu$ the step at which state $\mu$ first appears in the relabeled sequence. (For example, $t_0 = 0$, $t_1 = 1$, $t_2 = 4$ and $t_3 = 2$ for the attractor sequence in Eq. (?).)

When state $\mu$ first appears in the attractor on or after $t = 0$, the winning state is $\sigma(t_\mu)$. Furthermore, since there is no phase space motion along the nonhistorical directions, $A^\mu(t_\mu) = A^\mu(0)$. Since the winning state is determined by the minority decision, we have $A^\mu(0)[2\sigma(t_\mu) - 1] < 0$. Similarly, when state $\mu$ appears in the attractor the second time, the winning state is $1 - \sigma(t_\mu)$, and $A^\mu(t) = A^\mu(0) + [2\sigma(t_\mu) - 1]\sqrt{2/\pi R}$. The winning condition imposes that $A^\mu(t)[1 - 2\sigma(t_\mu)] < 0$. Combining,

$$-\sqrt{\frac{2}{\pi R}} < A^\mu(0)[2\sigma(t_\mu) - 1] < 0. \tag{30}$$
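Suppose the game starts from the initial state $A_0^\mu$, which are Gaussian variables with mean 0 and variance $1/N$. They change in steps of size $\sqrt{2/\pi R}$ until they reach the attractor, whose $2D$ historical states are then given by

$$A^\mu = \sqrt{\frac{2}{\pi R}}\,\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}\,A_0^\mu\right) \quad\text{and}\quad \sqrt{\frac{2}{\pi R}}\left[\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}\,A_0^\mu\right) - 1\right], \tag{31}$$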



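where $\mathrm{frac}(x)$ represents the fractional part of $x$. Using Eq. (?), this corresponds to a variance of decisions given by $\sigma^2/N = f(\rho)/2\pi\rho$, where

$$f(\rho) = \frac{1}{4} + \left\langle \frac{1}{D}\sum_\mu \left[\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}A_0^\mu\right) - \frac{1}{2}\right]^2 \right\rangle - \left\langle \left\{\frac{1}{D}\sum_\mu \left[\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}A_0^\mu\right) - \frac{1}{2}\right]\right\}^2 \right\rangle. \tag{32}$$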



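Since $A_0^\mu$ are independent variables, $f(\rho)$ is simplified to

$$f(\rho) = \frac{1}{4} + \left(1 - \frac{1}{D}\right)\left\{\left\langle\left[\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}A_0^\mu\right)\right]^2\right\rangle - \left\langle \mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}A_0^\mu\right)\right\rangle^2\right\}. \tag{33}$$

Since $A_0^\mu$ are Gaussian variables with mean 0 and variance $1/N$, we have

$$\left\langle\left[\mathrm{frac}\!\left(\sqrt{\frac{\pi R}{2}}A_0^\mu\right)\right]^k\right\rangle = \sum_{n=-\infty}^{\infty}\int_0^1 dx\,x^k\,\frac{1}{\pi\sqrt{\rho}}\exp\!\left[-\frac{(x+n)^2}{\pi\rho}\right]. \tag{34}$$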



When $\rho \ll 1$, the integrals are dominated by peaks at $x = 0$ and $x = 1$, yielding $\langle \mathrm{frac}(\sqrt{\pi R/2}\,A_0^\mu)\rangle = \langle[\mathrm{frac}(\sqrt{\pi R/2}\,A_0^\mu)]^2\rangle = 1/2$. As a result, $f(\rho) = (1 - 1/2D)/2$. On the other hand, when $\rho \gg 1$, the step sizes become much smaller than the variance of $A_0^\mu$, so that $\mathrm{frac}(\sqrt{\pi R/2}\,A_0^\mu)$ becomes uniformly distributed between 0 and 1, leading to $\langle \mathrm{frac}(\sqrt{\pi R/2}\,A_0^\mu)\rangle = 1/2$ and $\langle[\mathrm{frac}(\sqrt{\pi R/2}\,A_0^\mu)]^2\rangle = 1/3$, resulting in $f(\rho) = (1 - 1/4D)/3$ for $\rho \gg 1$. Hence $f(\rho)$ is a smooth function of $\rho$ varying, for example, from 3/8 to 7/24 for $m = 1$. Thus $\sigma^2/N$ depends on $\rho$ mainly through the step size factor $1/2\pi\rho$, whereas $f(\rho)$ merely provides a higher order correction to the functional dependence. This accounts for the scaling regime in Figs. [?] and [?]. Furthermore, we note that $f(\rho)$ rapidly approaches 1/3 when $m$ increases. Hence for general values of $D$, $\sigma^2/N \approx 1/6\pi\rho$, provided that $m$ is not too large. This leads to the data collapse of the variance for $m = 1$ and $m = 2$ in the inset of Fig. [?].
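The two limits of $f(\rho)$ quoted above can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that $\sqrt{\pi R/2}\,A_0^\mu$ is Gaussian with mean 0 and variance $\pi\rho/2$ (which follows from the variance $1/N$ of $A_0^\mu$), that frac means $x - \lfloor x \rfloor$, and that $f(\rho) = 1/4 + (1 - 1/D)\,[\langle \mathrm{frac}^2\rangle - \langle \mathrm{frac}\rangle^2]$, an expression chosen here because it reproduces both quoted limits. The function names and grid sizes are illustrative.

```python
import math

def frac_moment(rho, k, n_grid=2000):
    """k-th moment of frac(y) for y ~ N(0, pi*rho/2), by midpoint quadrature
    of the wrapped Gaussian density on [0, 1)."""
    var = math.pi * rho / 2.0               # variance of sqrt(pi R / 2) * A_0
    sd = math.sqrt(var)
    n_max = int(math.ceil(8.0 * sd)) + 1    # integer shifts with non-negligible weight
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    dx = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        x = (i + 0.5) * dx                  # midpoint of the i-th cell
        dens = sum(math.exp(-(x + n) ** 2 / (2.0 * var))
                   for n in range(-n_max, n_max + 1))
        total += (x ** k) * dens * norm * dx
    return total

def f_of_rho(rho, D, n_grid=2000):
    """Assumed form: f = 1/4 + (1 - 1/D) * Var[frac]."""
    first = frac_moment(rho, 1, n_grid)
    second = frac_moment(rho, 2, n_grid)
    return 0.25 + (1.0 - 1.0 / D) * (second - first * first)

if __name__ == "__main__":
    for rho in (1e-6, 1e-2, 1.0, 50.0):
        grid = 20000 if rho < 1e-3 else 2000
        print(rho, f_of_rho(rho, D=2, n_grid=grid))
```

A fine grid is needed at small $\rho$ because the density is then concentrated in narrow peaks at the two ends of the unit interval.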

Analogous to the multinomial regime, the hypercube picture implies that both the standard deviation and the average of $A^\mu$ are bounded by the step size. Hence the variance is a sufficient measure of system efficiency.

This result can be compared with that in [?], where it was found that the variance scales as $\alpha^{1/2}$ in the presence of random initial conditions. A similar $\alpha^{1/2}$ scaling was also reported for the batch MG [?]. Their results differ from ours, in which the variance is effectively independent of $D$ (where $\alpha = D/N$). However, the simulation data in Fig. [?] indicate that the two results need not be in conflict. For a sufficiently large value of $\rho$, say $\rho = 16$, the data in the regime immediately below $\alpha_c$ appear to be consistent with a power-law dependence with an exponent approaching 0.5, as predicted by [?]. When $\alpha$ reaches lower values, the variance flattens out, showing that our results are applicable to the regime of $m$ being not too large.



V. THE KINETIC SAMPLING REGIME 

When $\rho \sim N$, the average step sizes scale as $N^{-1}$ and are no longer self-averaging. Rather, Eq. (14) shows that the size of a step along the direction of historical states at time $t$ is $2/N$ times the number of agents who switch strategies at time $t$, which is Poisson distributed with a mean $\Delta/2$, implied by Eq. (28). Here $\Delta$ is the average step size given by Eq. (?). However, since the attractor is formed by steps which reverse the sign of $A^\mu$, the average step size in the attractor is larger than that in the transient state, because a long jump in the vicinity of the attractor is more likely to get trapped.

To consider the origin of this effect, we focus in Fig. 6 on how the average number of agents, who hold the identity strategy with $\sigma_a^\mu = \mu$ and its complementary strategy $\sigma_b^\mu = 1 - \mu$, depends on the preference $\omega + \Omega_a - \Omega_b$, when the system reaches the steady state in games with $m = 1$. Since the preferences are time dependent, we sample their frequencies at a fixed time, say, immediately before $t = 0$ in the inset of Fig. 6. One would expect that the bias distribution is reproduced. However, we find that a sharp peak exists at $\omega + \Omega_a - \Omega_b = -1$. This value of the preference corresponds to that of the attractor step from $t = 3$ to $t = 0$, when at state 0, decision 0 wins and decision 1 loses, and $\omega + \Omega_a - \Omega_b$ changes from $-1$ to $+1$. The peak at the attractor step shows that its average step is self-organized to be larger than those of the transient steps described by the background distribution. Similarly for $m = 2$, Fig. 7 shows the average number of agents who hold the XOR strategy and its complement when the attractor sequence is Eq. (?). At the attractor step immediately before $t = 4$ in the inset of Fig. 7, the state is 1. Decision 1 wins and decision 0 loses, changing the preference $\omega + \Omega_a - \Omega_b$ from $-1$ to $+1$, and hence contributing to the sharp peak at $\omega + \Omega_a - \Omega_b = -1$.

FIG. 6: Experimental evidence of the kinetic sampling effect for $m = 1$: steady-state preference distribution of the average number of agents holding the identity strategy and its complement, immediately before $t = 0$, for $\rho = N = 1023$, averaged over 100000 samples. Inset: The labeling of the time steps in the attractor.

This effect that favors the cooperation of larger clusters of agents is referred to as the kinetic sampling effect. To describe this effect, we consider the probability $P_{\rm att}(\Delta A)$ of step sizes $\Delta A$ in the attractor. For convenience, we only consider $\Delta A^\mu > 0$ for all $\mu$. Assuming that all states of the phase space are equally likely to be accessed by the initial condition, we have

$$P_{\rm att}(\Delta A) = \sum_{A} P_{\rm att}(\Delta A, A), \tag{35}$$

where $P_{\rm att}(\Delta A, A)$ is the probability of finding the position $A$ with displacement $\Delta A$ in the attractor. Consider the example of $m = 1$, where there is only one step along each axis $A^\mu$. The sign reversal condition implies that

$$P_{\rm att}(\Delta A, A) = P_{\rm poi}(\Delta A)\prod_\mu \Theta[-A^\mu(A^\mu + \Delta A^\mu)], \tag{36}$$

where $P_{\rm poi}(\Delta A)$ is the Poisson distribution of step sizes, yielding

$$P_{\rm att}(\Delta A) \propto P_{\rm poi}(\Delta A)\prod_\mu \Delta A^\mu. \tag{37}$$
FIG. 7: Experimental evidence of the kinetic sampling effect for $m = 2$: steady-state preference distribution of the average number of agents holding the XOR strategy and its complement, immediately before $t = 4$, for $\rho = N = 511$, averaged over 50000 samples. Inset: The labeling of the time steps in the attractor.

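We note that the extra factors of $\Delta A^\mu$ favor large step sizes. Thus the attractor averages $\langle(\Delta A^\pm)^2\rangle_{\rm att}$, which are required for computing the variance of decisions, are given by

$$\langle(\Delta A^\pm)^2\rangle_{\rm att} = \frac{\langle(\Delta A^\pm)^2\,\Delta A^+\Delta A^-\rangle_{\rm poi}}{\langle\Delta A^+\Delta A^-\rangle_{\rm poi}}. \tag{38}$$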

Furthermore, correlation effects come into action when the step sizes become non-self-averaging. There are agents who contribute to both $\Delta A^+$ and $\Delta A^-$, giving rise to their correlations. Thus, the variance of decisions is higher when correlation effects are considered. In Eq. (?), the strategies of the agents contributing to $\Delta A^+$ and $\Delta A^-$ satisfy $\xi_a^+ - \xi_b^+ = \pm 2$ and $\xi_a^- - \xi_b^- = \mp 2$ respectively. Among the agents contributing to $\Delta A^+$, the extra requirement of $\xi_a^- - \xi_b^- = \mp 2$ implies that an average of 1/4 of them also contribute to $\Delta A^-$. Hence, the number of agents contributing to both steps is a Poisson variable with mean $\Delta/8$. Similarly, the numbers of agents exclusive to the individual steps are Poisson variables with means $3\Delta/8$. Algebraically, Eq. (14) can be decomposed as

AA± = I 5^ 5^ Sabi-r - fi. + Qbm^ - + 2r)6{C - - 2r) 

a<b r=±l 

+1 E E - ^« + ^^^^^^a - + 2rmc - e?) + m - + 2r)]}. (39) 

a<b r=±l 

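Respectively, the first and second terms are equal to $2/N$ times the number of agents common to both steps $\Delta A^\pm$ and exclusive to the individual steps, with means $\Delta/8$ and $3\Delta/8$, as can be verified by a derivation similar to that of Eq. (?) from Eq. (14). Hence the denominator of Eq. (38) is given by

$$\langle\Delta A^+\Delta A^-\rangle_{\rm poi} = \sum_{a_0, a_\pm} \frac{e^{-\Delta/8}}{a_0!}\left(\frac{\Delta}{8}\right)^{a_0} \frac{e^{-3\Delta/8}}{a_+!}\left(\frac{3\Delta}{8}\right)^{a_+} \frac{e^{-3\Delta/8}}{a_-!}\left(\frac{3\Delta}{8}\right)^{a_-} \frac{4}{N^2}(a_0 + a_+)(a_0 + a_-). \tag{40}$$

Expressing the moments of Poisson variables in terms of their means, we arrive at

$$\langle\Delta A^+\Delta A^-\rangle_{\rm poi} = \frac{\Delta(2\Delta + 1)}{2N^2}. \tag{41}$$

Similarly, the numerator of Eq. (38) is given by

$$\langle(\Delta A^\pm)^2\,\Delta A^+\Delta A^-\rangle_{\rm poi} = \frac{16}{N^4}\left[256\left(\frac{\Delta}{8}\right)^4 + 240\left(\frac{\Delta}{8}\right)^3 + 40\left(\frac{\Delta}{8}\right)^2 + \frac{\Delta}{8}\right]. \tag{42}$$

Together we obtain

$$\langle(\Delta A^\pm)^2\rangle_{\rm att} = \frac{2\Delta^3 + 15\Delta^2 + 20\Delta + 4}{N^2(2\Delta + 1)}. \tag{43}$$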
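The kinetic-sampling average can be checked by brute force. The sketch below is illustrative only. It assumes, as stated in the text, that the number of agents common to both steps is Poisson with mean $\Delta/8$, that the numbers exclusive to each step are Poisson with mean $3\Delta/8$, and that $N\Delta A^\pm$ is twice the number of switching agents. It weights each configuration by the product $\Delta A^+\Delta A^-$ and compares $N^2\langle(\Delta A^+)^2\rangle_{\rm att}$ with the rational function $(2\Delta^3 + 15\Delta^2 + 20\Delta + 4)/(2\Delta + 1)$. The function names and the truncation `k_max` are choices made here, not part of the paper.

```python
import math

def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def attractor_mean_square(delta, k_max=40):
    """N^2 <(dA+)^2>_att as a ratio of truncated Poisson sums,
    each configuration weighted by the kinetic-sampling factor dA+ * dA-."""
    p_common = [poisson_pmf(k, delta / 8.0) for k in range(k_max)]
    p_excl = [poisson_pmf(k, 3.0 * delta / 8.0) for k in range(k_max)]
    num = 0.0
    den = 0.0
    for a0 in range(k_max):
        for ap in range(k_max):
            for am in range(k_max):
                w = p_common[a0] * p_excl[ap] * p_excl[am]
                step_plus = 2 * (a0 + ap)    # N * dA+
                step_minus = 2 * (a0 + am)   # N * dA-
                den += w * step_plus * step_minus
                num += w * step_plus ** 3 * step_minus
    return num / den

def closed_form(delta):
    """Rational function compared against the brute-force sum."""
    return (2 * delta ** 3 + 15 * delta ** 2 + 20 * delta + 4) / (2 * delta + 1)

if __name__ == "__main__":
    for delta in (0.5, 2.0, 4.0):
        print(delta, attractor_mean_square(delta), closed_form(delta))
```

For small $\Delta$ both expressions tend to 4, the value for a step carried by a single agent.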

The possible attractor states are given by $A^\mu = m_\mu/N$ and $m_\mu/N - \Delta A^\mu$, where $m_\mu = 1, 3, \ldots, N\Delta A^\mu - 1$. This yields a variance of

$$\frac{\sigma^2}{N} = \frac{1}{4N}\left\{\frac{1}{2D}\sum_{\mu=0}^{D-1}\left[m_\mu^2 + (m_\mu - N\Delta A^\mu)^2\right] - \left[\frac{1}{2D}\sum_{\mu=0}^{D-1}(2m_\mu - N\Delta A^\mu)\right]^2\right\}. \tag{44}$$

Averaging over the attractor states, we find

$$\frac{\sigma^2}{N} = \frac{7\langle(N\Delta A^+)^2\rangle_{\rm att} + 7\langle(N\Delta A^-)^2\rangle_{\rm att} - 8}{192N}, \tag{45}$$

which gives, on combining with Eq. (43),

$$\frac{\sigma^2}{N} = \frac{14\Delta^3 + 105\Delta^2 + 132\Delta + 24}{96N(2\Delta + 1)}. \tag{46}$$

When the diversity is low, $\Delta \gg 1$, and Eq. (46) reduces to $\sigma^2/N = 7/48\pi\rho$, agreeing with the scaling result of the previous section. When $\rho \sim N$, Eq. (46) has excellent agreement with simulation results, which significantly deviate above the scaling relation, as shown in Fig. [?].

When $\rho \gg N$, Eq. (46) predicts that $\sigma^2/N$ should approach $1/4N$. This can be explained as follows. Analysis shows that only those agents holding the identity strategy and its complement can complete both hops along the axes after they have adjusted their preferences to $\omega + \Omega_a - \Omega_b = \pm 1$. Since there are fewer and fewer fickle agents in the limit $\rho \gg N$, one would expect that a single agent of this type would dominate the game dynamics, and $\sigma^2/N$ would approach $1/4N$.

However, as shown in Fig. [?], the simulation data approach the limit $0.43/N$ when $\rho \gg N$, significantly higher than $0.25/N$. This discrepancy requires the consideration of the waiting effect, which has been sketched in [16], and will be explained in detail elsewhere.
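The two limits of the $m = 1$ kinetic-sampling variance can be evaluated directly. The sketch below is a hedged illustration. It takes the variance to be $(14\Delta^3 + 105\Delta^2 + 132\Delta + 24)/[96N(2\Delta + 1)]$ and assumes $\Delta = N\sqrt{2/\pi R}$ with $R = \rho N$, an identification inferred here from the statement that a step is $2/N$ times a Poisson number of mean $\Delta/2$ and from the scaling-regime step size $\sqrt{2/\pi R}$. The function names are illustrative.

```python
import math

def delta_from_rho(rho, N):
    """Assumed relation: Delta = N * sqrt(2 / (pi * R)) with R = rho * N."""
    return math.sqrt(2.0 * N / (math.pi * rho))

def variance_m1(delta, N):
    """sigma^2 / N for m = 1 with kinetic sampling."""
    numerator = 14 * delta ** 3 + 105 * delta ** 2 + 132 * delta + 24
    return numerator / (96.0 * N * (2 * delta + 1))

if __name__ == "__main__":
    N = 1023
    for rho in (0.01, 1.0, float(N), 1e4 * N):
        d = delta_from_rho(rho, N)
        print(rho, d, variance_m1(d, N),
              7.0 / (48.0 * math.pi * rho),   # low-diversity (scaling) form
              1.0 / (4.0 * N))                # single-agent limit
```

At low diversity the first comparison column is approached, and at very high diversity the second, which is the limit the text identifies as too low compared with simulations.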

Next, we turn to the kinetic sampling effects for $m = 2$. As shown in Fig. [?](b), the situation is more complicated than that of $m = 1$ since there are two steps moving along each of the directions $A^1$ and $A^2$. Consider the attractor sequence in Eq. (?). The step $\Delta A(1)$ can initiate from $A^1 = m_1/N$, with $m_1 = -1, \ldots, -N\Delta A(1) + 1$, where for convenience the state labels of the step sizes at time $t$ are implicitly taken to be the historical states $\mu^*(t)$. Similarly, the step $\Delta A(5)$ can initiate from $A^1 = m_5/N$, with $m_5 = 1, \ldots, N\Delta A(5) - 1$. However, since the two steps are linked by steps along the direction $A^2$, their positions are no longer independent. Taking into consideration the many possibilities of their relative displacements makes the problem intractable. As shown in Fig. 8, we only consider the most probable case that the two steps are symmetrically positioned, that is, their midpoints have the same $A^1$ coordinate. In this case, the possible initial positions of the steps are $A^1(1) = m_1/N$, with $m_1 = -1, \ldots, -[N\Delta A(1) + N\Delta A(5)]/2 + 1$, and $A^1(5) = m_5/N$, with $m_5 = m_1 + [N\Delta A(1) + N\Delta A(5)]/2$. Thus, the number of possible states along the direction $A^1$ is $[N\Delta A(1) + N\Delta A(5)]/4$. Considering the motion in the 4 directions, the total number of possible states is $[N\Delta A(0)/2]\,[(N\Delta A(1) + N\Delta A(5))/4]\,[N\Delta A(2)/2]\,[(N\Delta A(4) + N\Delta A(6))/4]$.

Extending the derivation of Eq. (45) to the case of $m = 2$, we have

$$\frac{\sigma^2}{N} = [\ldots], \tag{47}$$

where the attractor averages are defined as the Poisson averages weighted by kinetic sampling. For example,

$$\langle\Delta A(0)^2\rangle_{\rm att} = \frac{\langle\Delta A(0)[\Delta A(1) + \Delta A(5)]\Delta A(2)[\Delta A(4) + \Delta A(6)]\,\Delta A(0)^2\rangle_{\rm poi}}{\langle\Delta A(0)[\Delta A(1) + \Delta A(5)]\Delta A(2)[\Delta A(4) + \Delta A(6)]\rangle_{\rm poi}}. \tag{48}$$

This requires us to compute Poisson averages such as $\langle\Delta A(t_1)\cdots\Delta A(t_k)\rangle_{\rm poi}$. The following identity for Poisson averages is useful. Consider a universal set of $M$ elements, in which the sizes of the sets $B_1, \ldots, B_k$ and their intersections are Poisson distributed. Then the expectation of the product $|B_1|\cdots|B_k|$ is given by

$$\langle|B_1|\cdots|B_k|\rangle = \prod_{r=1}^{k}\langle|B_r|\rangle + \sum_{r<s}\langle|B_r\cap B_s|\rangle\prod_{u\neq r,s}\langle|B_u|\rangle + \cdots + \left\langle\left|\bigcap_{r=1}^{k}B_r\right|\right\rangle. \tag{49}$$

This identity can be proved by writing

$$|B_1|\cdots|B_k| = \sum_{i_1=1}^{M}\cdots\sum_{i_k=1}^{M}\Theta(i_1\in B_1)\cdots\Theta(i_k\in B_k), \tag{50}$$

where $\Theta(i_r\in B_r) = 1$ if $i_r\in B_r$ and 0 otherwise. In the limit of $M$ approaching infinity, the case that all $i_r$ are distinct yields the expectation value in the first term of Eq. (49), the case that $i_r = i_s$ corresponds to the second term, and the case that all $i_r$ are identical corresponds to the last term, and so on.

Therefore, we can write

$$\langle\Delta A(1)\cdots\Delta A(k)\rangle = \left(\frac{2}{N}\right)^{k}\left\{\prod_{r=1}^{k}b_r + \sum_{r<s}b_{rs}\prod_{u\neq r,s}b_u + \cdots + b_{1\cdots k}\right\}, \tag{51}$$

where $b_{r_1\cdots r_i}$ is the average number of agents simultaneously contributing to the steps $\Delta A(r_1), \ldots, \Delta A(r_i)$.

FIG. 8: The relative positions of the steps $\Delta A(1)$ and $\Delta A(5)$ for the case $\Delta A(5) > \Delta A(1)$. Here they are shown symmetrically positioned.
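The identity for Poisson averages is a sum over all ways of grouping the $k$ indices into blocks, each block contributing the mean size of the corresponding intersection. The sketch below illustrates it under the assumption that the disjoint regions of the sets are independent Poisson variables. `set_partitions`, `poisson_product_mean` and `brute_force_pair` are names introduced here for illustration, and the brute-force check covers only $k = 2$.

```python
import math
from itertools import product

def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part

def poisson_product_mean(labels, b):
    """Sum over set partitions of the product of b[frozenset(block)].
    Intersections missing from `b` are taken to have zero mean."""
    total = 0.0
    for part in set_partitions(list(labels)):
        term = 1.0
        for block in part:
            term *= b.get(frozenset(block), 0.0)
        total += term
    return total

def brute_force_pair(l1, l2, lc, k_max=40):
    """E[|B1| |B2|] when |B1| = e1 + c and |B2| = e2 + c, with e1, e2, c
    independent Poisson variables of means l1, l2, lc."""
    def pmf(k, m):
        return math.exp(-m) * m ** k / math.factorial(k)
    total = 0.0
    for e1, e2, c in product(range(k_max), repeat=3):
        total += pmf(e1, l1) * pmf(e2, l2) * pmf(c, lc) * (e1 + c) * (e2 + c)
    return total

if __name__ == "__main__":
    l1, l2, lc = 0.7, 1.1, 0.4
    b = {frozenset({1}): l1 + lc, frozenset({2}): l2 + lc, frozenset({1, 2}): lc}
    print(poisson_product_mean([1, 2], b), brute_force_pair(l1, l2, lc))
```

To obtain an average of step sizes rather than of agent numbers, the result would be multiplied by $(2/N)^k$, as in the expression above.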

Consider the attractor sequence in Eq. (?). Tracing the time evolution of the cumulative payoffs, the step sizes at $t = 2$ and $t = 6$, for example, are given by

^^(2) = ^ E E ^'^^(-^ - ^''(2) + mme - - 2r), (52) 

a<b r=±l 

^^(6) = I E E ^«''(-^' - ^-^(2) + n,{2) +ea- a -ea+ e,mea - c + 2/). (53) 

a<b r'=±l 

Following the analysis of Eq. (39), we find $b_2 = b_6 = \Delta/2$. To find $b_{26}$, we note that the
agents shared by the two steps satisfy either r = r' and ~ ^1 ~ ~ ~ ~2r, or r = — r' 
and ea - 4' = 0, ea - eb = 2r. This leads to 

= E E - ^'^(2) + n,{2)))6{ea - ^b - 2r) 

a<b r=±l 

X ml - eb + 2r)5{C - il + 2r) + 5(C - 4^)<5(C^ - il - 2r)}. (54) 

The two terms in this expression consist of the contributions to $\Delta A(2)$, with the extra restrictions of the two cases stated above, respectively. Since each difference $\xi_a^\mu - \xi_b^\mu$ takes a given nonzero value $\pm 2r$ with probability 1/4 and the value 0 with probability 1/2, we get $b_{26} = 3\Delta/32$. Other parameters are listed in Table IV. This enables us to find

$$\langle\Delta A(0)[\Delta A(1) + \Delta A(5)][\Delta A(4) + \Delta A(6)]\Delta A(2)\rangle_{\rm poi} = \frac{1}{8N^4}\left(32\Delta^4 + 84\Delta^3 + [?]\,\Delta^2 + 2\Delta\right). \tag{55}$$

Other expressions appearing in Eq. (47) can be found similarly. The final result is

$$\frac{\sigma^2}{N} = \frac{160\Delta^5 + 1680\Delta^4 + 4772\Delta^3 + [?]\,\Delta^2 + [?]\,\Delta + 17}{64N(32\Delta^3 + 84\Delta^2 + [?]\,\Delta + 2)}. \tag{56}$$

Since the attractor sequence in Eq. (?) yields the same result, Eq. (56) is the sample average of the variance. When the diversity is low, $\Delta \gg 1$, and Eq. (56) reduces to $\sigma^2/N = 5/32\pi\rho$, agreeing with the scaling result of the previous section. When $\rho \sim N$, Eq. (56) shows that the introduction of kinetic sampling significantly improves the theoretical agreement with simulation results, as shown in Fig. [?]. When $\rho \gg N$, Eq. (56) implies that $\sigma^2/N$ approaches $17/128N$. This result is not valid since it is below the lowest possible result of $1/4N$ when each step is executed by the strategy switching of only one agent. The discrepancy can be traced to the approximation that the average number of states along the direction $A^1$ is $[N\Delta A(1) + N\Delta A(5)]/4$, which is not precise for small steps. For example, it can take half
b_0, b_1, b_2, b_4, b_5, b_6
Δ/4
Δ/8
b_06, b_14, b_16, b_24, b_45, b_56
3Δ/32
b_02, b_04, b_05, b_25, b_26
Δ/16
b_015, b_046, b_125, b_246
Δ/32
b_012, b_014, b_016, b_056, b_124, b_126, b_245
3Δ/128
b_024, b_026
Δ/64
b_025
Δ/128
b_0124, b_0126
Δ/64
b_0125, b_0246

TABLE IV: Values of $b_{t_1\cdots t_i}$ for the attractor sequence in Eq. (?). The steps at $t = 3$ and $t = 4$ are identical, so are the steps at $t = 6$ and $t = 7$. Other unlisted parameters are zero.



integer values. We will not pursue this issue further since, in any case, waiting effects have to be taken into account in analysing the case $\rho \gg N$.

In summary, we have explained the reduction of variance by the reduction of the fraction of fickle agents when diversity increases. The theoretical analysis from Sections III to V spans the 3 regimes of small $R$, scaling, and kinetic sampling, yielding excellent agreement with simulations over 7 decades.

It is natural to consider whether the results presented here can be generalized to the case of the exogenous MG, in which the information $\mu(t)$ is randomly and independently drawn at each time step $t$, each of the $D$ states having probability $1/D$ [6]. This is different from the present endogenous version of the MG, in which the information is determined by the sequence of the winning bits in the game history. The similarities and differences between the behavior of the two versions have been a topic of interest in the literature [?]. Here we compare their behavior in games of small $m$ using the phase space we introduced.

In the scaling regime, the picture that the states of the game are hopping between hypercubes in the phase space remains valid, as shown in Fig. 9 for $m = 1$. At the steady state, the attractor consists of hoppings along all edges of a hypercube, in contrast to the endogenous case, in which only a fraction of hypercube vertices belong to the attractor. The behavior in the scaling regime depends on the scaling of the step sizes with diversity, rather than the actual sequence of the steps. Consequently, the behavior is the same as in the endogenous game. In the kinetic sampling regime, the physical picture that larger steps are more likely to be trapped remains valid, and the behavior remains qualitatively similar to the endogenous case.

FIG. 9: An attractor of the exogenous Minority Game for $m = 1$.

VI. THE FRACTION OF FICKLE AGENTS 

This physical picture of the diversity effects is further illustrated by considering the fraction $f_{\rm fi}$ of fickle agents when the game has reached the steady state. They hold strategy pairs whose preferences are distributed near zero, and change sign during the attractor dynamics. As confirmed in Figs. 10 and 11, three regimes of behavior exist.

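In the multinomial regime, we can make use of the explicit knowledge about the attractor sequence and the evolution of the payoffs in the attractor dynamics. Consider the example of $m = 1$. We count the types of fickle agents labeled by the strategy pairs $a < b$ and bias $\omega$ for all $t$, with preferences

$$\omega + \Omega_a(t) - \Omega_b(t) = \pm 1 \quad\text{and}\quad \xi_a^\mu - \xi_b^\mu = \mp 2\,\mathrm{sgn}\,A^\mu(t), \tag{57}$$

where $\mu = \mu^*(t)$. Equivalently, we have

$$\omega + \Omega_a(t) - \Omega_b(t) = \frac{1}{2}[2\sigma(t) - 1]\left(\xi_a^{\mu^*(t)} - \xi_b^{\mu^*(t)}\right), \tag{58}$$

where $\Omega_a(t)$ is updated by

$$\Omega_a(t+1) = \Omega_a(t) + \xi_a^{\mu^*(t)}[2\sigma(t) - 1]. \tag{59}$$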
FIG. 10: The dependence of the fraction of fickle agents on the randomness $R$ at $m = 1$ and $s = 2$. Notations are the same as those of Fig. [?].

FIG. 11: The dependence of the fraction of fickle agents on the randomness $R$ at $m = 2$ and $s = 2$. Notations are the same as those of Fig. [?].

This enables us to count the types directly from the knowledge of the attractor sequences, such as Eqs. (?) and (?), without having to know the step sizes. Results for $m = 1$ and $m = 2$ are listed in Tables V and VI respectively. Note that the values in the tables depend on the convention of ordering the strategies $a < b$, and here the convention of Eq. (?) is adopted. Other conventions may classify the types with bias $\omega$ as $-\omega$, or vice versa. Since the average number of fickle agents of each type is given by Eq. (?), $f_{\rm fi}$ can then be obtained by summing up the contribution from each type.

ω        (a)    (b)    (c)    (d)    Total
-3        1      0      0      0      1
-1        5      4      0      3      12
 1        1      3      6      3      13
 3        0      0      1      1      2
Total     7      7      7      7

TABLE V: The number of types of fickle agents for the attractors (a)-(d) in Fig. [?].



Consider the example of $m = 1$. Table V shows that there are 7 types of fickle agents for each attractor shown in Fig. [?]. Averaging over initial states, we find that an average of 25/4 types consist of agents with biases $\omega = \pm 1$, and an average of 3/4 types with $\omega = \pm 3$, this result being independent of the ordering of $a < b$. Since the average number of agents holding strategy pair $a < b$ is $N/8$, we have

$$f_{\rm fi} = \frac{25}{32}\binom{R}{\frac{R-1}{2}}\frac{1}{2^R} + \frac{3}{32}\binom{R}{\frac{R-3}{2}}\frac{1}{2^R}. \tag{60}$$



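For $m = 2$, the numbers of types of fickle agents for the 16 attractors in Table III are listed in Table VI. There are 194 types of fickle agents for each attractor. The fraction of fickle agents is given by

$$f_{\rm fi} = \frac{1121}{1024}\binom{R}{\frac{R-1}{2}}\frac{1}{2^R} + \frac{373}{1024}\binom{R}{\frac{R-3}{2}}\frac{1}{2^R} + \frac{55}{1024}\binom{R}{\frac{R-5}{2}}\frac{1}{2^R} + \frac{3}{1024}\binom{R}{\frac{R-7}{2}}\frac{1}{2^R}. \tag{61}$$

In the scaling regime $\rho \sim 1$, we consider the limit of $R \gg 1$ in Eq. (60), and obtain for $m = 1$,

$$f_{\rm fi} = \frac{7}{8}\sqrt{\frac{2}{\pi R}}. \tag{62}$$

Similarly, from Eq. (61), we have for $m = 2$,

$$f_{\rm fi} = \frac{97}{64}\sqrt{\frac{2}{\pi R}}. \tag{63}$$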
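The passage from the multinomial expressions to the scaling limit can be illustrated numerically. This is a sketch under explicit assumptions, not the authors' procedure. The bias $\omega$ is taken to be a sum of $R$ independent $\pm 1$ steps (so that $P(\omega) = \binom{R}{(R-\omega)/2}/2^R$ for odd $R$). The $m = 1$ weights $25/32$ and $3/32$ follow from the 25/4 and 3/4 average type counts times the pair fraction 1/8 quoted in the text. The $m = 2$ weights $1121, 373, 55, 3$ over $1024$ are read from a damaged equation and sum to $194/128$. All function names are illustrative.

```python
from math import comb, sqrt, pi

def bias_prob(R, omega):
    """P(bias = omega) for a sum of R independent +/-1 steps (binomial assumption)."""
    if abs(omega) > R or (R - omega) % 2:
        return 0.0
    return comb(R, (R - omega) // 2) / 2 ** R

def fickle_fraction_m1(R):
    """Multinomial-regime fraction for m = 1 with weights 25/32 and 3/32."""
    return (25 / 32) * bias_prob(R, 1) + (3 / 32) * bias_prob(R, 3)

def fickle_fraction_m2(R):
    """Multinomial-regime fraction for m = 2 (weights assumed, see lead-in)."""
    weights = {1: 1121, 3: 373, 5: 55, 7: 3}
    return sum(w / 1024 * bias_prob(R, omega) for omega, w in weights.items())

def scaling_limit(R, prefactor):
    """Large-R form: prefactor * sqrt(2 / (pi R))."""
    return prefactor * sqrt(2 / (pi * R))

if __name__ == "__main__":
    for R in (11, 101, 1001):
        print(R,
              fickle_fraction_m1(R), scaling_limit(R, 7 / 8),
              fickle_fraction_m2(R), scaling_limit(R, 97 / 64))
```

The prefactors 7/8 and 97/64 are simply the sums of the respective weights, since every binomial probability near the center tends to $\sqrt{2/\pi R}$ at large $R$.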
Total     194 for each of the 16 attractors

TABLE VI: The number of types of fickle agents for the 16 attractors in Table III at $m = 2$.

In the kinetic sampling regime, the fraction of fickle agents for $m = 1$ is obtained by replacing $(\Delta A^\pm)^2$ in the numerator of Eq. (38) by $(a_0 + a_+ + a_-)/N$, following the notation used in Eq. (40). The result is

$$f_{\rm fi} = \frac{14\Delta^2 + 39\Delta + 8}{8N(2\Delta + 1)}. \tag{64}$$

In the limit of low diversity, $\Delta \gg 1$ and Eq. (64) reduces to Eq. (62). In the limit of high diversity, $\Delta \ll 1$ and $f_{\rm fi}$ approaches $1/N$, implying that a single agent would dominate the game dynamics. However, since waiting effects are neglected, this result is considerably lower than the simulation results.
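The replacement rule described above can be carried out numerically. The sketch below is a hedged illustration with names chosen here. It assumes independent Poisson numbers $a_0$ (mean $\Delta/8$) and $a_\pm$ (mean $3\Delta/8$), weights each configuration by the kinetic-sampling factor $(a_0 + a_+)(a_0 + a_-)$, and averages the total number of switching agents $a_0 + a_+ + a_-$. The result, $N f_{\rm fi}$, is compared with $(14\Delta^2 + 39\Delta + 8)/[8(2\Delta + 1)]$.

```python
import math

def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def fickle_number_m1(delta, k_max=40):
    """N * f_fi: kinetic-sampling-weighted mean of a0 + a+ + a-."""
    p_common = [poisson_pmf(k, delta / 8.0) for k in range(k_max)]
    p_excl = [poisson_pmf(k, 3.0 * delta / 8.0) for k in range(k_max)]
    num = 0.0
    den = 0.0
    for a0 in range(k_max):
        for ap in range(k_max):
            for am in range(k_max):
                weight = (p_common[a0] * p_excl[ap] * p_excl[am]
                          * (a0 + ap) * (a0 + am))   # proportional to dA+ * dA-
                den += weight
                num += weight * (a0 + ap + am)
    return num / den

def fickle_closed_form(delta):
    """Rational function compared against the weighted sum."""
    return (14 * delta ** 2 + 39 * delta + 8) / (8.0 * (2 * delta + 1))

if __name__ == "__main__":
    for delta in (0.5, 1.0, 3.0):
        print(delta, fickle_number_m1(delta), fickle_closed_form(delta))
```

As $\Delta \to 0$ both tend to 1, i.e. $f_{\rm fi} \to 1/N$, the single-agent limit discussed in the text.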

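For $m = 2$, the fraction of fickle agents is given by the size of the union set of fickle agents at all steps,

$$f_{\rm fi} = \frac{1}{N}\left(\sum_r \langle b_r\rangle_{\rm att} - \sum_{r<s}\langle b_{rs}\rangle_{\rm att} + \sum_{r<s<u}\langle b_{rsu}\rangle_{\rm att} - \cdots\right), \tag{65}$$

where

$$\langle b_{r_1\cdots r_i}\rangle_{\rm att} = \frac{\langle\Delta A(0)[\Delta A(1) + \Delta A(5)]\Delta A(2)[\Delta A(4) + \Delta A(6)]\,b_{r_1\cdots r_i}\rangle_{\rm poi}}{\langle\Delta A(0)[\Delta A(1) + \Delta A(5)]\Delta A(2)[\Delta A(4) + \Delta A(6)]\rangle_{\rm poi}}. \tag{66}$$

The result is

$$f_{\rm fi} = \frac{1552\Delta^4 + 8170\Delta^3 + [?]\,\Delta^2 + 2801\Delta + 64}{32N(32\Delta^3 + 84\Delta^2 + [?]\,\Delta + 2)}. \tag{67}$$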
In the limit of low diversity, $\Delta \gg 1$ and Eq. (67) reduces to Eq. (63). In the limit of high diversity, $f_{\rm fi}$ approaches $1/N$. However, by tracing the types of fickle agents switching strategies at each time step, one cannot find any single type of agents who can contribute to the dynamics of all steps. In fact, the minimum number of agents that can complement each other to complete the dynamics is 2. For example, one agent can complete the steps at $t = 0, 1, 2, 3, 4$, while the other one can complete the steps $t = 5, 6, 7$. Hence the asymptotic limit of $f_{\rm fi} = 1/N$ is not valid. The source of the discrepancy is the same as that for the invalid result of the asymptotic variance of decisions explained in the previous section.

As shown in Figs. 10 and 11, the theoretical predictions are confirmed by simulations, except in the regime of extremely high diversity, where waiting effects have to be taken into account [16].



VII. CONVERGENCE TIME 



Many properties of the system dependent on the transient dynamics also depend on its diversity. For example, since diversity reduces the fraction of agents switching strategies at each time step, it also slows down the convergence to the steady state. Hence the convergence time increases with diversity.

We consider the example of $m = 1$. The dynamics of the game proceeds in the direction which reduces the variance [?]. In the multinomial regime, the initial position of $A^\mu$ in the phase space lies in the attractor. Convergence to the steady state is almost instant. Starting from the initial state 0, the convergence time is 2, 0, 0, 1 in the 4 respective quadrants of the phase space in Fig. [?]. For the initial state 1, the game has the same set of convergence times, except that the order described is permuted. Hence, the convergence time is 2, 1 and 0 with probabilities 1/4, 1/4 and 1/2 respectively, yielding the average convergence time of 3/4.

In the scaling regime, it is convenient to make use of the rectilinear nature of the motion in the phase space. We divide the phase space into hypercubes with dimensions $\sqrt{2/\pi R}$. Starting from the initial state 0, the convergence paths are shown in Fig. 12. The convergence time $\tau$ of an initial state from inside a hypercube is the number of steps it hops between the hypercubes on its way to the attractor, as shown in Fig. 13.
FIG. 12: The convergence paths starting from the initial state 0 in the 4 quadrants of the phase space for $m = 1$.

FIG. 13: The dependence of the convergence time on the initial position in the phase space for $m = 1$, starting from the initial state 0. The dimensions of the hypercubes are $\sqrt{2/\pi R}$. Inset: The 3 regimes of convergence time in the continuum limit.

In general, the convergence time is given by the following cases, where $x = \lfloor\sqrt{\pi R/2}\,A^1(0)\rfloor$ and $y = \lfloor\sqrt{\pi R/2}\,A^0(0)\rfloor$: (a) $3x + y + 2$ for $x \geq 0$ and $y \geq -x - 1$; (b) $-x - 3y - 4$ for $y \leq -2$ and $y \leq -x - 2$; (c) $-x + y - 1$ for $x \leq -2$ and $y \geq -1$; (d) $y$ for $x = -1$ and $y \geq 0$; (e) 0 for $x = y = -1$.
The average convergence time is then obtained by averaging over the Gaussian distribution of the initial $A^\mu(0)$ with mean 0 and variance $1/N$. When $\rho$ is small, the initial positions are mainly distributed around the origin, reducing the convergence time to that of the multinomial regime. When $\rho$ is large, the initial positions are broadly distributed among many hypercubes in the phase space, and one can take a continuum approximation as shown in the inset of Fig. 13. Thus, the average convergence time is given by

$$\tau = \sqrt{\frac{\pi\rho}{2}}\left[\int_0^\infty Dx\int_{-x}^\infty Dy\,(3x + y) + \int_{-\infty}^0 Dy\int_{-\infty}^{-y} Dx\,(-x - 3y) + \int_{-\infty}^0 Dx\int_0^\infty Dy\,(-x + y)\right], \tag{68}$$

where $Dx = dx\,e^{-x^2/2}/\sqrt{2\pi}$ is the Gaussian measure. The result is

$$\tau = (2 + \sqrt{2})\sqrt{\rho}. \tag{69}$$

As shown in Fig. 14, there is an excellent agreement between theory and simulations.
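The crossover of the average convergence time from the multinomial value 3/4 to the $\sqrt{\rho}$ law can be reproduced with a short Monte Carlo average over initial positions. The sketch below is illustrative and rests on assumptions made here: the hypercube coordinates are taken as the floor of $\sqrt{\pi R/2}\,A^\mu(0)$, which is Gaussian with variance $\pi\rho/2$, and the inequalities of cases (a) to (e) are taken to be non-strict, which makes the five cases cover every cube exactly once. The function names, sample size and seed are arbitrary choices.

```python
import math
import random

def convergence_time(x, y):
    """Hops to the attractor from the hypercube with integer coordinates (x, y),
    following cases (a)-(e) with non-strict inequalities (an assumption)."""
    if x >= 0 and y >= -x - 1:
        return 3 * x + y + 2          # case (a)
    if y <= -2 and y <= -x - 2:
        return -x - 3 * y - 4         # case (b)
    if x <= -2 and y >= -1:
        return -x + y - 1             # case (c)
    if x == -1 and y >= 0:
        return y                      # case (d)
    return 0                          # case (e): x = y = -1

def mean_convergence_time(rho, n_samples=200_000, seed=0):
    """Average over Gaussian initial positions, in units of the hypercube size."""
    rng = random.Random(seed)
    sd = math.sqrt(math.pi * rho / 2.0)
    total = 0
    for _ in range(n_samples):
        x = math.floor(rng.gauss(0.0, sd))
        y = math.floor(rng.gauss(0.0, sd))
        total += convergence_time(x, y)
    return total / n_samples

if __name__ == "__main__":
    for rho in (1e-6, 1.0, 100.0):
        print(rho, mean_convergence_time(rho), (2 + math.sqrt(2)) * math.sqrt(rho))
```

At large $\rho$ the discrete average differs from the continuum law by a correction of order one, so agreement is expected only in relative terms.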

The $\rho^{1/2}$ dependence of the convergence time can be interpreted as follows. In the scaling regime, since the step size in the phase space scales as $1/\sqrt{R}$ and the initial position of $A^\mu$ has components scaling as $1/\sqrt{N}$, the convergence time should scale as $(1/\sqrt{N})/(1/\sqrt{R}) \sim \rho^{1/2}$. This scaling relation remains valid in the kinetic sampling regime where $\rho \sim N$, since kinetic sampling only affects the description of the attractor, rather than the transient behavior.



VIII. WEALTH SPREAD 



Another system property dependent on the transient is the distribution of wealth or resources, especially those among the frozen agents (that is, agents who do not switch their strategies at the steady state). Since the system dynamics reaches a periodic attractor, they have constant average wealth at the steady state. Hence any spread in their wealth distribution is a consequence of the transient dynamics.

To simplify the analysis, we only consider the agents who hold identical strategy pairs. Since they never switch strategies, and both outputs 1 and 0 have equal occurrence at the attractor, their wealth averaged over a period becomes a constant, and their wealth is equal to the cumulative payoff of the identical strategies they hold.

In the multinomial regime, the wealth of agents holding identical strategies $a$ is given by Eq. (?), with the required values listed in Table [?]. For $m = 1$, the periodic averages $\langle\Omega_a\rangle_t$ of the cumulative payoffs of strategies and their variances $\langle\langle\Omega_a\rangle_t^2\rangle_a$ are listed in Table VII. Thus, the wealth spread $W$ is the variance $\langle\langle\Omega_a\rangle_t^2\rangle_a$ of $\langle\Omega_a\rangle_t$, averaged over the four strategies and the four attractors, and is equal to 5/8.

ξ^1    ξ^0        (a)     (b)     (c)     (d)
-1     -1          1       0      -1       0
 1     -1         1/2     1/2     1/2     3/2
-1      1         1/2     1/2     1/2     3/2
 1      1         -1       0       1       0
Variance          5/8     1/8     5/8     9/8

TABLE VII: The variance of the periodic average of wealth $\langle\Omega_a\rangle_t$ of the 4 strategies, for the 4 attractors of $m = 1$.

In the scaling regime, the initial position may be located away from the origin of the phase space. Using the hypercube picture of the transient motion, we can work out the cumulative payoffs of the strategies by considering their changes when their initial position shifts to successive neighboring hypercubes. The distribution of wealth variance is shown in Fig. 15. In general, if $x = \lfloor\sqrt{\pi R/2}\,A^1(0)\rfloor$ and $y = \lfloor\sqrt{\pi R/2}\,A^0(0)\rfloor$, then the average wealth of the 4 strategies in Table VII are $x + y + 1$, $-x + y - 1/2$, $x - y + 1/2$ and $-x - y - 1$ respectively. This leads to a wealth spread of $x^2 + y^2 + 3x/2 + y/2 + 5/8$.

The value of $W$ is then obtained by averaging the wealth spread over the Gaussian distribution of the initial positions in the phase space, each component $A^\mu(0)$ with mean 0 and variance $1/N$. When $\rho$ is small, the initial positions are mainly distributed around the origin, reducing the wealth spread $W$ to the value at the multinomial regime. When $\rho$ is large, the initial positions are broadly distributed among many hypercubes in the phase space. Applying the continuum approximation,

$$W = \frac{\pi R}{2N}\int Dx\int Dy\,(x^2 + y^2) = \pi\rho. \tag{70}$$

The same scaling relation applies to the kinetic sampling regime. As shown in Fig. 16, agreement between theory and simulations is excellent. Note that the behavior closely resembles that of the convergence time in Fig. 14, showing that it is a transient behavior.
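The same kind of Monte Carlo average illustrates how the wealth spread interpolates between 5/8 and $\pi\rho$. The sketch below is a minimal illustration under assumptions stated here: the hypercube coordinates are the floor of Gaussian variables with variance $\pi\rho/2$, and the per-cube spread is $x^2 + y^2 + 3x/2 + y/2 + 5/8$ as given in the text. The names, sample size and seed are illustrative.

```python
import math
import random

def wealth_spread(x, y):
    """Wealth spread for an initial position in the hypercube (x, y)."""
    return x * x + y * y + 1.5 * x + 0.5 * y + 0.625

def mean_wealth_spread(rho, n_samples=200_000, seed=0):
    """Average of the spread over Gaussian initial positions (floor coordinates assumed)."""
    rng = random.Random(seed)
    sd = math.sqrt(math.pi * rho / 2.0)
    total = 0.0
    for _ in range(n_samples):
        x = math.floor(rng.gauss(0.0, sd))
        y = math.floor(rng.gauss(0.0, sd))
        total += wealth_spread(x, y)
    return total / n_samples

if __name__ == "__main__":
    for rho in (1e-6, 1.0, 100.0):
        print(rho, mean_wealth_spread(rho), math.pi * rho)
```

In the four cubes adjacent to the origin the spread takes the values 5/8, 1/8, 9/8 and 5/8, whose mean is the multinomial value 5/8.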
FIG. 14: The dependence of the average convergence time on the diversity at $m = 1$.

FIG. 15: The dependence of the variance of wealth among the agents holding identical strategies on the initial position in the phase space for $m = 1$. The dimensions of the hypercubes are $\sqrt{2/\pi R}$.

IX. DISCUSSIONS 

We have studied the effects of diversity in the initial preference of strategies on a game 
with adaptive agents competing selfishly for finite resources. Introducing diversity is both 
useful in modeling agent behavior in economic markets, and as a means to improve dis- 
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FIG. 16: The dependence of the variance of wealth on the diversity among the agents holding 
identical strategies for m = 1. 

tributed control. We find that it leads to the emergence of a high system efficiency. We have 
made use of the small memory sizes m to visualize the motion in the phase space. Scaling of 
step sizes accounts for the dependence of the efficiency on the diversity in the scaling regime 
(p ~ 1), while kinetic sampling effects have to be considered at higher diversity, yielding the- 
oretical predictions with excellent agreement with simulations up to p ~ A^. However, when 
diversity increases further, waiting effects have to be considered jl^ and will be discussed 
in details elsewhere. The variance of decisions decreases with diversity, showing that the 
maladaptive behavior is reduced. On the other hand, the convergence time and the wealth 
spread increases with diversity. 
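
The reduction of the variance with diversity can be illustrated with a small simulation.
The following is a minimal sketch, not the simulation code used for the figures: the
function name `simulate_variance`, the Gaussian form of the initial biases, the random
tie-breaking rule, and all parameter values are assumptions made for illustration. Each
agent holds two random strategies, plays the one with the higher cumulative payoff, and
every strategy receives a step payoff of +1 or -1 depending on whether it predicted the
minority side.

```python
import random


def simulate_variance(n_agents=101, memory=1, bias_variance=0.0,
                      n_steps=2000, n_transient=1000, seed=0):
    """Return the variance of the summed decisions A(t), divided by n_agents."""
    if n_agents % 2 == 0:
        raise ValueError("n_agents must be odd so that a minority always exists")
    rng = random.Random(seed)
    n_histories = 2 ** memory
    # Two random strategies per agent: lookup tables history -> decision (+1 or -1).
    strategies = [
        [[rng.choice((-1, 1)) for _ in range(n_histories)] for _ in range(2)]
        for _ in range(n_agents)
    ]
    # Initial cumulative payoffs carry random biases with the given variance.
    # The Gaussian form is an assumption of this sketch.
    sigma = bias_variance ** 0.5
    payoffs = [[rng.gauss(0.0, sigma) for _ in range(2)] for _ in range(n_agents)]
    history = rng.randrange(n_histories)
    samples = []
    for step in range(n_steps):
        total = 0
        for strat, pay in zip(strategies, payoffs):
            if pay[0] > pay[1]:
                best = 0
            elif pay[1] > pay[0]:
                best = 1
            else:
                best = rng.randrange(2)  # break ties at random
            total += strat[best][history]
        # The total is odd, hence never zero; the minority side is its opposite sign.
        minority = -1 if total > 0 else 1
        # Step payoffs: every strategy is scored, whether it was played or not.
        for strat, pay in zip(strategies, payoffs):
            for k in range(2):
                pay[k] += 1 if strat[k][history] == minority else -1
        if step >= n_transient:
            samples.append(total)
        # Append the latest outcome to the history string (kept to `memory` bits).
        history = ((history << 1) | int(minority == 1)) % n_histories
    mean = sum(samples) / len(samples)
    variance = sum((a - mean) ** 2 for a in samples) / len(samples)
    return variance / n_agents


if __name__ == "__main__":
    for rho in (0.0, 1.0, 10.0):
        print(rho, simulate_variance(bias_variance=rho * 101))
```

Here the diversity is set through `bias_variance = rho * n_agents`, following the
definition of the diversity as the bias variance per agent. With a homogeneous start
(`bias_variance=0.0`) agents holding the same strategies switch together, whereas with
diverse biases only a few agents sit at the verge of switching at any step.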

While the present results apply mostly to the cases of small m, qualitative predictions can
be made about higher values of m. An extension of Eq. (23) shows that when $\alpha$ increases,
the step size becomes smaller and smaller in the asymptotic limit. There is a critical
slowdown, since the convergence time diverges at $\alpha_c = 1/\pi = 0.3183$. When $\alpha$ exceeds
$\alpha_c$, the step size vanishes before the system reaches the attractor near the origin, so
that the state of the system is trapped at locations with at least some components being
nonzero. The interpretation is that when $\alpha$ is large, the distribution of strategies becomes
so sparse that motions in the phase space cannot be achieved by the switching of strategies.
This agrees with the picture of a phase transition from the symmetric to the asymmetric phase
when $\alpha$ increases [21]. It is interesting to note that the value of $\alpha_c$ is close to the
value of 0.3374 obtained by the continuum approximation [27] or batch update [15] using linear
payoff functions.
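
How close the two critical values are can be quantified with a short calculation. The
sketch below assumes that the quoted value 0.3183 is 1/π, an identification consistent
with the digits given; the variable names are illustrative.

```python
import math

# Critical value quoted in the text, assumed here to be 1/pi.
alpha_c = 1.0 / math.pi
# Value quoted for the continuum approximation or batch update with linear payoffs.
alpha_linear = 0.3374
# Gap between the two values, relative to the linear-payoff value.
relative_gap = (alpha_linear - alpha_c) / alpha_linear

print(f"alpha_c = {alpha_c:.4f}")            # alpha_c = 0.3183
print(f"relative gap = {relative_gap:.1%}")  # relative gap = 5.7%
```

The two values differ by roughly six percent.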

Another extension to general m applies to the symmetric phase of the exogenous game. 
In this case the attractor can be approximated by a hyperpolygon enclosing the origin of 
the phase space. Using a generating function approach, we have computed the variance of 
decisions, taking into account the scaling of step sizes and kinetic sampling; the analysis 
will be presented elsewhere. The results agree qualitatively with simulations of both the 
exogenous and endogenous games, except for values of $\alpha$ close to $\alpha_c$. In fact, when
$\alpha$ increases, there is an increasing fraction of samples in which the attractors are more
complex than hyperpolygons. For example, in the endogenous case, there is an increasing fraction
of attractors whose periods are no longer 2D [28]. Instead, their periods become multiples
of the fundamental period 2D. It is remarkable that the population variance is not seriously
affected by the structural change of the attractor, probably because the dynamical
description of such a long-period attractor has strong overlaps with those of several distinct
attractors of period 2D.

Besides step payoffs, the case of linear payoffs is equally interesting. In fact, the latter case 
has also been considered recently, and the variance of decisions is also found to decrease with 
diversity [29]. There are significant differences between the two cases, though, indicating that
agents striving to maximize different payoffs cause the system to self-organize in different 
fashions. The details will be explained elsewhere. 

From the viewpoint of game theory, it is natural to consider whether the introduction 
of diversity assists the game to reach a Nash equilibrium, in contrast to the case of the 
homogeneous initial condition where maladaptation is prevalent. It has been verified that 
Nash equilibria consist of pure strategies. Hence all frozen agents have no incentive to
switch their strategies. In fact, since the dynamics in the attractor is periodic for small 
m, with states ±1 appearing once each in response to each historical string, the payoffs 
of all strategies become zero when averaged over a period. Thus, the Nash equilibrium is 
approached in the sense that the fraction of fickle agents decreases with increasing diversity. 
In the limit of $\rho \gg N$, it is probable that only one fickle agent switches strategy at each step
in the attractor, as predicted by Eq. (??) for the case m = 1. In this case, agents who switch
their decisions cannot increase their payoffs, since on switching, the minority ones would 
become losers, and the majority ones would change the minority side to majority and lose. 
(Though the fickle agents are not playing pure strategies, this argument implies that their 
payoffs are the same as if they were doing so.) Then a Nash equilibrium is reached exactly.
However, as mentioned previously, waiting effects become important in the extremely diverse
limit, and there are cases in which more than one fickle agent contributes to a single step in
the attractor dynamics, so that a Nash equilibrium cannot be reached.
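
The switching argument can be checked by brute force. The sketch below is an illustration
under stated assumptions: it treats a single round with step payoffs, an odd number of
agents, and decisions of +1 or -1; the helper names `payoff` and `is_nash` are
hypothetical. A configuration counts as a Nash equilibrium here if no single agent gains
by flipping its decision while all others stay put.

```python
def sign(x):
    return (x > 0) - (x < 0)


def payoff(decisions, i):
    """Step payoff of agent i: +1 on the minority side, -1 on the majority side."""
    return -decisions[i] * sign(sum(decisions))


def is_nash(decisions):
    """True if no agent can raise its own payoff by unilaterally flipping."""
    for i in range(len(decisions)):
        deviated = list(decisions)
        deviated[i] = -deviated[i]
        if payoff(deviated, i) > payoff(decisions, i):
            return False
    return True


# Majority exceeds minority by exactly one agent: nobody gains by switching.
balanced = [1] * 51 + [-1] * 50
# Majority exceeds minority by five agents: a majority agent gains by switching.
unbalanced = [1] * 53 + [-1] * 48

print(is_nash(balanced), is_nash(unbalanced))  # True False
```

With a margin of one, a majority agent that switches turns its new side into the
majority, and a minority agent that switches joins the majority, so neither improves.
With a larger margin a majority agent can defect to the minority side and win, which
corresponds to the case where more than one agent contributes to a single step.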

The combination of scaling and kinetic sampling in accounting for the steady state prop- 
erties of the system illustrates the importance of dynamical considerations in describing the 
system behavior, at least for small values of m. We anticipate that these dynamical effects 
will play a crucial role in explaining the system behavior in the entire symmetric phase, 
since when $\alpha$ increases, the state motion in a high dimensional phase space can easily shift
the tail of the cumulative payoff distributions to the verge of strategy switching, leading to 
the sparseness condition where kinetic sampling effects are relevant. Due to their generic 
nature inherent in multi-agent systems with dynamical attractors formed by the collective 
actions of many adaptive agents, we expect that these effects are relevant to minority games 
with different payoff functions and updating rules, as well as other multi-agent systems with 
adaptive agents competing for limited resources. 

The sensitivity of the steady state to the initial conditions has implications for adaptation
and learning in games. First, when the MG is used as a model of financial markets, it shows 
that the maladaptive behavior is, to a large extent, an artifact of the homogeneous initial 
condition. In practice, when agents enter the market with diverse views on the values of 
the strategies, the corresponding initial condition should be randomized, and the market 
efficiency is better than previously believed. Second, when the MG is used as a model of 
distributed load balancing, the present study illustrates the importance of adopting diverse
initial conditions in order to attain the optimal system efficiency. The effect is reminiscent
of the dynamics of learning in neural networks, in which case an excessive learning rate
might hinder the convergence to the optimum [30].